From: GB on
I have a *large* amount of data that I need to sort and have available on
disk for easy access. It's about 20 million items, and about a gigabyte. I
thought that probably a tree is the way to go. I imagine this has already
been programmed many, many times in VBA, so can somebody kindly point me to
some code please?



--
Murphy's ultimate law is that if something that could go wrong doesn't, it
turns out that it would have been better if it had gone wrong.


From: "Tony Jollans" My forename at my surname dot on
When you have your data sorted, how do you plan to access it?

I'm sure that Sorts have been programmed many times, but whether Word is a
sensible tool for the job, I somehow doubt. Why do you want to do it in VBA?

--
Enjoy,
Tony

www.WordArticles.com

"GB" <NOTsomeone(a)microsoft.com> wrote in message
news:4bc1e81c$0$2485$db0fefd9(a)news.zen.co.uk...
>I have a *large* amount of data that I need to sort and have available on
>disk for easy access. It's about 20 million items, and about a gigabyte. I
>thought that probably a tree is the way to go. I imagine this has already
>been programmed many, many times in VBA, so can somebody kindly point me to
>some code please?
>
>
>
> --
> Murphy's ultimate law is that if something that could go wrong doesn't,
> it turns out that it would have been better if it had gone wrong.
>

From: GB on
Tony Jollans wrote:
> When you have your data sorted, how do you plan to access it?
>
> I'm sure that Sorts have been programmed many times, but whether Word
> is a sensible tool for the job, I somehow doubt. Why do you want to
> do it in VBA?

Well, I'll need to access the data from within word, so I thought that
setting up the database within word made good sense, so I do the job once
not twice.

Why do you think that VBA/Word is not up to the job? I'd rather learn that
now than later.




>
> "GB" <NOTsomeone(a)microsoft.com> wrote in message
> news:4bc1e81c$0$2485$db0fefd9(a)news.zen.co.uk...
>> I have a *large* amount of data that I need to sort and have
>> available on disk for easy access. It's about 20 million items, and
>> about a gigabyte. I thought that probably a tree is the way to go. I
>> imagine this has already been programmed many, many times in VBA, so
>> can somebody kindly point me to some code please?
>>
>>
>>
>> --
>> Murphy's ultimate law is that if something that could go wrong
>> doesn't, it turns out that it would have been better if it had gone
>> wrong.

--
Murphy's ultimate law is that if something that could go wrong doesn't,
it turns out that it would have been better if it had gone wrong.


From: "Tony Jollans" My forename at my surname dot on
Word is not a database tool. Small amounts of data can be dealt with withn
it, but 1GB will exceed the maximum document size, and, even if it didn't,
performance would likely be dreadful, so, even if you want to access it from
Word, you will have to hold it somewhere else.

Strictly, how and where you sort your data, and where you ultimately hold it
are not necessarily related but if, for example, you were going to set up an
Access database, it would be better to do the work in Access.

As for Word VBA not being up to the job (of sorting), I think memory would
likely be the biggest concern - dedicated sort software would handle that
kind of thing automatically, in VBA you'd (probably) have to code up lots of
logic to have overflow storage somewhere on disk. Mechanically, WordBasic
Sort _may_ be able to do it but, again, the volume of data and the memory
required would be a concern.

At a guess you would need a minimum of 2GB (and very probably more, although
I'm not sure VBA can address more than that) of available RAM, which
probably means you would need to be running a 64-bit system - maybe, if you
have the hardware, you could give it a shot. Just try reading your data into
an array - if you can get that far, the sort might be worth a shot, but you
still have the issue of where you are going to save the sorted data, and
what way you are going to access it when it has been sorted. Using the
database as an example, again, you may not need to do the sorting yourself -
a database load function might be able to do it all behind the scenes.

--
Enjoy,
Tony

www.WordArticles.com

"GB" <NOTsomeone(a)microsoft.com> wrote in message
news:4bc213da$0$2476$db0fefd9(a)news.zen.co.uk...
> Tony Jollans wrote:
>> When you have your data sorted, how do you plan to access it?
>>
>> I'm sure that Sorts have been programmed many times, but whether Word
>> is a sensible tool for the job, I somehow doubt. Why do you want to
>> do it in VBA?
>
> Well, I'll need to access the data from within word, so I thought that
> setting up the database within word made good sense, so I do the job once
> not twice.
>
> Why do you think that VBA/Word is not up to the job? I'd rather learn that
> now than later.
>
>
>
>
>>
>> "GB" <NOTsomeone(a)microsoft.com> wrote in message
>> news:4bc1e81c$0$2485$db0fefd9(a)news.zen.co.uk...
>>> I have a *large* amount of data that I need to sort and have
>>> available on disk for easy access. It's about 20 million items, and
>>> about a gigabyte. I thought that probably a tree is the way to go. I
>>> imagine this has already been programmed many, many times in VBA, so
>>> can somebody kindly point me to some code please?
>>>
>>>
>>>
>>> --
>>> Murphy's ultimate law is that if something that could go wrong
>>> doesn't, it turns out that it would have been better if it had gone
>>> wrong.
>
> --
> Murphy's ultimate law is that if something that could go wrong doesn't,
> it turns out that it would have been better if it had gone wrong.
>

From: GB on
Tony Jollans wrote:
> Word is not a database tool. Small amounts of data can be dealt with
> withn it, but 1GB will exceed the maximum document size, and, even if
> it didn't, performance would likely be dreadful, so, even if you want
> to access it from Word, you will have to hold it somewhere else.

The way a tree sort works is that you set up a large random access file on
disk. You add the records one at a time, with pointers (ie an additional
field with the record number) to the next record. As you add more records,
you just change the pointers in the 'adjacent' records accordingly. So, Word
only needs to have enough space for 3 records in memory at a time.

I just thought that, rather than programming this myself, I would see if
somebody else has already done this and tested it?

When you said it wouldn't work, I was worried that you meant there are
restrictions on random access file sizes that vba can handle - something
like that. There may be of course, but you didn't mention this.