May 16, 2005

Two Tables into One using Office2K Pro

In July 2002, I used Access 2000 to merge some BIOM ORDER tables with thirteen columns and over 19,000 records. Access is easier to use and more accurate for this than any Excel formula I tried. Some steps might be automated if there were a need (not so far), but the process is fairly quick once you have done it a few times. It takes me about an hour or two on a fast PC, as in 2 gigahertz with lots of memory.

Here is how I did this:


1.Verify the the data is properly delimited in Excel 2000 in two separate XLS files, that each column has a good heading with no punctuation, and that some heading could become the primary field (e.g. Key) and has an entry for every record in both tables. The common column is later used to define a critical one-to-many relationship between tables.

2.Create a temporary c:\projects directory, and then in Access 2000 To create new database "combined.mdb" in that directory. Close the dialogue asking how to create a table so you can do this manually. Use Windows Explorer to copy the source Excel files there as well.

3.In Access 2000 with Tables highlighted under the Objects section of the "combined:Database" window and three "create table" selections on the right, Insert -> Table -> Import Table, select the first Excel file, check the "First Row Contains Column Headings" box, let Access store the data "In a New Table", go with all defaults for field names, and let Access select/create a primary key (you can change it later).

4.Repeat for the second spreadsheet/table. Then rename the tables if the resulting names do not reflect their purposes. Fortunately these did: BIOM ORDER EXTRACT and BIOM ORDER NOTES EXTRACT.

5.Right-click the "parent" table - that is, the one in which that common column ('Key" here) contains only unique entries and no null entries. In this case BIOM ORDER EXTRACT was that parent table.

6.Select Design View, right-click the little box to the left of the name of the common column (again "Key" here), and make it the primary field. Then highlight the former (Access-generated) primary key and use "Delete Rows" to kill it. Close the table (not Access), saving changes.

7.Click Tools -> Relationships and add both new tables to that view. Resize and move windows so all fields can be seen, some space exists between the two, and the parent table (with unique "keys") is on the left. Primary keys should appear in bold. Click and drag the left "key" to the same-named "key" on the right and release.

8.Verify that "key" is the field selected from both tables. If not, cancel. Check the box to "enforce referential integrity" and click Create. The result should show a line between those fields with a "1" above it on the left or "parent" side, and an infinity sign on the right or "child" side. This indicates a new "one-to-many" relationship between tables.

Note: These relationships are at the heart of any "relational" database. If they are missing or configured badly, expect many problems. In a larger system (like ALEPH) there may be hundreds.

9.Close, save your changes, click "Queries" in the list of Objects, and double-click "Create query by using Wizard". Click the double arrows (>>) to pick every field from the first table. Then under Tables/Queries pick the second table and use the double arrows again. You should see a list of every field from both tables on the right. Then click Next.

10.Leave the default "Detail" view because you'll fix it up in Excel. For now we just want the query to show the raw data. Note that tweaking such queries allows you to filter or combine data in many clever ways. For now we want everything. Click Next, then Finish (default name). The results of the query should appear in an Excel-like window.

11.From within the query results window, click File -> Export, select the target directory (usually the same as the sources), pick Excel 97-2000 from the "Save as type" drop-down list, and give the new file a sensible, short name reflecting the contents of this process. I named the results of this one BIOM_ORDER_Combined_Query.XLS.

12.Close Access 2000, open Excel 2000, and open the new file you made.

13.Create a new first column, move the "Key" contents there, and delete the empty column you left behind plus any (Access-generated) "ID" column.

14.From the menus click Data -> Sort -> and select a few useful criteria. In this case I chose "key" then "internal_note" and then "divnote". The users could do this, but it saves them some trouble. Now save the file.

15.Copy this modified Excel file to wherever the users expect to find the data, in this case O:\biomed\aleph\NOTIS Reports\06262002. This directory name is a reminder of when I started working with the source data. A new directory with new data may be created later this summer.

The SQL statement below, hidden within the wizard-generated query, is a VERY simple example of the power of Structured Query Language. With a few tweaks, we could limit the results by any number of criteria.

Even more impressive to some systems admins is how the SQL client and server can be so cleanly divided (client/server), so systems using pure SQL over TCP/IP with no direct file access are generally more secure, faster, and more reliable than any older low-end systems like Paradox. Properly implemented, a small SQL system can even work by modem.

Access 2000 can work as either a low-end traditional database or as a powerful client in a true client/server environment, with SQL Server or any other supported SQL database at the server end. For this reason alone I think it's a wonderful learning tool compared to other low-end databases.

-Brad

*** SQL from that query ***

SELECT [BIOM ORDER EXTRACT].title, [BIOM ORDER EXTRACT].ordunit, [BIOM ORDER EXTRACT].scope, [BIOM ORDER EXTRACT].vendcode, [BIOM ORDER EXTRACT].action, [BIOM ORDER EXTRACT].vendnote, [BIOM ORDER EXTRACT].internal_note, [BIOM ORDER EXTRACT].divnote, [BIOM ORDER NOTES EXTRACT].ID, [BIOM ORDER NOTES EXTRACT].key, [BIOM ORDER NOTES EXTRACT].statement, [BIOM ORDER NOTES EXTRACT].type, [BIOM ORDER NOTES EXTRACT].[M Date], [BIOM ORDER NOTES EXTRACT].[A Date] FROM [BIOM ORDER EXTRACT] INNER JOIN [BIOM ORDER NOTES EXTRACT] ON [BIOM ORDER EXTRACT].key = [BIOM ORDER NOTES EXTRACT].key;

Posted by tapli005 at May 16, 2005 9:29 AM