From: cardan on 5 May 2010 20:33

Hello, I have a large database issue I hope someone can help me with. It is hard to explain, so I have tried to be as descriptive as possible without losing sight of my problem.

OVERALL QUESTION
Is there a way to have my Excel workbook search a very large text database outside the Excel file? I am working with an extremely large data set, and my Excel file is getting so large (20M) that I need to figure out a way to reduce file size (and maybe increase functionality and flexibility).

QUICK BACKGROUND
Every bank in the US is required to report its financial statements on a quarterly basis in what are called Call Reports. This information is available free and online on the FDIC website. There are about 7,500 banks nationally that report. These reports contain maybe 1,000 different numbers, each given its own code (e.g. Total Assets is RCON2170; I'll call them RCON #'s for short), and every bank uses the same template.

I am able to download this data in bulk form as a zipped text folder, which I then extract. The extracted folder contains about 40 different text files, each named for a section in the report (e.g. RCCI is the balance sheet section). In each file, the bank's unique identifier is listed in column A and the number code (RCON) is listed in row 1. (Each bank identifier and number code is unique.) The whole unzipped folder is approximately 65M.

CURRENT SETUP
Of these 40 sections, I need about 6 in my Excel file. Right now I convert the sections I need into Excel format (importing them as tab-delimited text) and put them into my model. I have template tabs that use INDEX-MATCH formulas to find the right number based on the RCON code and the bank's unique identifier. This setup allows for comparisons among different banks. All the user has to do is input the bank's unique identifier, and it will return the appropriate numbers based on the RCON number. (Sometimes we will do 10 banks side by side.) The INDEX-MATCH works very well.

THE PROBLEM
The issue is that each tab represents a section of the report. Each section lists each bank (7,500) and contains approximately 100 RCON numbers (a data set of 750,000 fields per tab). My file for a single quarterly report is approximately 20M. Is there a way to have my Excel workbook search the text files for the appropriate RCON number and the bank's unique identifier? That way I can keep my files limited in size and may be able to include previous Call Reports for trend analysis.

Any help would be extremely appreciated. (Also let me know if this format is acceptable to understand my issues :)
From: JLatham on 5 May 2010 22:38

Short answer: yes. VBA can deal with this type of thing pretty well, and actually with pretty good speed.

The process would go something like this: You enter the RCON number into a cell or as a response to an InputBox$() statement. Then the code would open the text files and look for the RCON number in each row of input from them, and as the RCON is encountered, would extract the information you need from the row and put it into the Excel sheet.

In order to accomplish this, a very good analysis/understanding of the content and format of the lines of the text files is required. You usually have to write custom code to "parse" the input data from the text file to get just what you want from it and pull it into Excel.

As far as whether or not this format is acceptable to understand your issues, I think so - at least I think I understand your needs. But I could be wrong - that's been known to happen from time to time (usually with rather short intervals between the misunderstandings).
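
As a rough illustration of the lookup described above, here is a minimal sketch in Python (chosen for brevity; the same steps map onto VBA's `Open ... For Input`, `Line Input #` and `Split`). It assumes the layout from the original post: tab-delimited text, RCON codes in row 1, the bank identifier in column A. The function name `lookup_values`, the `BANK_ID` header label, the `RCON9999` code and all sample values are made up for the example and are not taken from the actual FDIC files.

```python
import csv
import io


def lookup_values(lines, bank_ids, rcon_codes, delimiter="\t"):
    """Return {bank_id: {rcon_code: value}} from one section file.

    `lines` is any iterable of text lines (an open file works).
    Assumption: row 1 holds the RCON codes and column A holds the
    bank identifier, as described in the original post.
    """
    reader = csv.reader(lines, delimiter=delimiter)
    header = next(reader)
    # Map each requested code to its column position in the header row;
    # codes that are not present in this section are silently skipped.
    columns = {code: header.index(code) for code in rcon_codes if code in header}
    wanted = set(bank_ids)
    found = {}
    for row in reader:
        if row and row[0] in wanted:
            found[row[0]] = {
                code: row[i] for code, i in columns.items() if i < len(row)
            }
            if len(found) == len(wanted):
                break  # every requested bank has been found; stop reading
    return found


# Made-up sample standing in for one extracted section file.
sample = io.StringIO(
    "BANK_ID\tRCON2170\tRCON9999\n"
    "1001\t500000\t450000\n"
    "1002\t750000\t690000\n"
)
result = lookup_values(sample, ["1002"], ["RCON2170"])
print(result)  # {'1002': {'RCON2170': '750000'}}
```

Because the file is read one line at a time and only the requested banks and codes are kept, the workbook never has to hold the full 7,500-row section; the same function could be pointed at the section files of earlier quarters for trend comparisons.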
From: Gary Keramidas on 5 May 2010 22:49

i actually automated a specific call report for a credit union. it was a report that they needed to fill out. they would retrieve the gl income and balance data from their server and then i would populate the report with the correct amounts.

doesn't really help. just thought i'd mention it since you mentioned a call report.

--
Gary Keramidas
Excel 2003