Prev: How do I extract text from drop down entry in Word 2003
Next: Intercept word's built-in command "Search and Replace (with its tabs "search, replace, GoTo)"
From: GB on 11 Apr 2010 11:17 I have a *large* amount of data that I need to sort and have available on disk for easy access. It's about 20 million items, and about a gigabyte. I thought that probably a tree is the way to go. I imagine this has already been programmed many, many times in VBA, so can somebody kindly point me to some code please? -- Murphy's ultimate law is that if something that could go wrong doesn't, it turns out that it would have been better if it had gone wrong.
From: "Tony Jollans" My forename at my surname dot on 11 Apr 2010 11:42 When you have your data sorted, how do you plan to access it? I'm sure that Sorts have been programmed many times, but whether Word is a sensible tool for the job, I somehow doubt. Why do you want to do it in VBA? -- Enjoy, Tony www.WordArticles.com "GB" <NOTsomeone(a)microsoft.com> wrote in message news:4bc1e81c$0$2485$db0fefd9(a)news.zen.co.uk... >I have a *large* amount of data that I need to sort and have available on >disk for easy access. It's about 20 million items, and about a gigabyte. I >thought that probably a tree is the way to go. I imagine this has already >been programmed many, many times in VBA, so can somebody kindly point me to >some code please? > > > > -- > Murphy's ultimate law is that if something that could go wrong doesn't, > it turns out that it would have been better if it had gone wrong. >
From: GB on 11 Apr 2010 14:24 Tony Jollans wrote: > When you have your data sorted, how do you plan to access it? > > I'm sure that Sorts have been programmed many times, but whether Word > is a sensible tool for the job, I somehow doubt. Why do you want to > do it in VBA? Well, I'll need to access the data from within word, so I thought that setting up the database within word made good sense, so I do the job once not twice. Why do you think that VBA/Word is not up to the job? I'd rather learn that now than later. > > "GB" <NOTsomeone(a)microsoft.com> wrote in message > news:4bc1e81c$0$2485$db0fefd9(a)news.zen.co.uk... >> I have a *large* amount of data that I need to sort and have >> available on disk for easy access. It's about 20 million items, and >> about a gigabyte. I thought that probably a tree is the way to go. I >> imagine this has already been programmed many, many times in VBA, so >> can somebody kindly point me to some code please? >> >> >> >> -- >> Murphy's ultimate law is that if something that could go wrong >> doesn't, it turns out that it would have been better if it had gone >> wrong. -- Murphy's ultimate law is that if something that could go wrong doesn't, it turns out that it would have been better if it had gone wrong.
From: "Tony Jollans" My forename at my surname dot on 11 Apr 2010 15:12 Word is not a database tool. Small amounts of data can be dealt with withn it, but 1GB will exceed the maximum document size, and, even if it didn't, performance would likely be dreadful, so, even if you want to access it from Word, you will have to hold it somewhere else. Strictly, how and where you sort your data, and where you ultimately hold it are not necessarily related but if, for example, you were going to set up an Access database, it would be better to do the work in Access. As for Word VBA not being up to the job (of sorting), I think memory would likely be the biggest concern - dedicated sort software would handle that kind of thing automatically, in VBA you'd (probably) have to code up lots of logic to have overflow storage somewhere on disk. Mechanically, WordBasic Sort _may_ be able to do it but, again, the volume of data and the memory required would be a concern. At a guess you would need a minimum of 2GB (and very probably more, although I'm not sure VBA can address more than that) of available RAM, which probably means you would need to be running a 64-bit system - maybe, if you have the hardware, you could give it a shot. Just try reading your data into an array - if you can get that far, the sort might be worth a shot, but you still have the issue of where you are going to save the sorted data, and what way you are going to access it when it has been sorted. Using the database as an example, again, you may not need to do the sorting yourself - a database load function might be able to do it all behind the scenes. -- Enjoy, Tony www.WordArticles.com "GB" <NOTsomeone(a)microsoft.com> wrote in message news:4bc213da$0$2476$db0fefd9(a)news.zen.co.uk... > Tony Jollans wrote: >> When you have your data sorted, how do you plan to access it? >> >> I'm sure that Sorts have been programmed many times, but whether Word >> is a sensible tool for the job, I somehow doubt. Why do you want to >> do it in VBA? > > Well, I'll need to access the data from within word, so I thought that > setting up the database within word made good sense, so I do the job once > not twice. > > Why do you think that VBA/Word is not up to the job? I'd rather learn that > now than later. > > > > >> >> "GB" <NOTsomeone(a)microsoft.com> wrote in message >> news:4bc1e81c$0$2485$db0fefd9(a)news.zen.co.uk... >>> I have a *large* amount of data that I need to sort and have >>> available on disk for easy access. It's about 20 million items, and >>> about a gigabyte. I thought that probably a tree is the way to go. I >>> imagine this has already been programmed many, many times in VBA, so >>> can somebody kindly point me to some code please? >>> >>> >>> >>> -- >>> Murphy's ultimate law is that if something that could go wrong >>> doesn't, it turns out that it would have been better if it had gone >>> wrong. > > -- > Murphy's ultimate law is that if something that could go wrong doesn't, > it turns out that it would have been better if it had gone wrong. >
From: GB on 11 Apr 2010 15:36
Tony Jollans wrote: > Word is not a database tool. Small amounts of data can be dealt with > withn it, but 1GB will exceed the maximum document size, and, even if > it didn't, performance would likely be dreadful, so, even if you want > to access it from Word, you will have to hold it somewhere else. The way a tree sort works is that you set up a large random access file on disk. You add the records one at a time, with pointers (ie an additional field with the record number) to the next record. As you add more records, you just change the pointers in the 'adjacent' records accordingly. So, Word only needs to have enough space for 3 records in memory at a time. I just thought that, rather than programming this myself, I would see if somebody else has already done this and tested it? When you said it wouldn't work, I was worried that you meant there are restrictions on random access file sizes that vba can handle - something like that. There may be of course, but you didn't mention this. |