From: analyst41 on
I need to keep in memory an array of around 60000 character variables,
each element of which can have a max length of 4000 bytes. But if you
add up the lengths of all the actual data values, it is only 1/8 of
60000*4000.

What would be the cleanest way to store this data to take advantage of
this fact?

Thanks.
From: Gordon Sande on
On 2010-04-21 19:22:06 -0300, "analyst41(a)hotmail.com"
<analyst41(a)hotmail.com> said:

> I need to keep in memory an array of around 60000 character variables,
> each element of which can have a max length of 4000 bytes. But if you
> add up the lengths of all the actual data values, it is only 1/8 of
> 60000*4000.
>
> What would be the cleanest way to store this data to take advantage of
> this fact?
>
> Thanks.

For a symbol table where the average length is short but a few symbols
can be much longer, I have used pointers (integer-valued subscripts)
into a storage pool. You can either be like Pascal and have a length
associated with each pointer, or like C and use a sentinel to end each
string (C uses a NUL, but ASCII does have ETX for end-of-text). It depends
on how important it is to know the length easily. Just another example
of "make sure you know all the operations to be done before you decide
on the data structure". The pointers, often called headers, will be
integers, but the storage pool will be characters. I tend to use an array
of characters of length one, but one might use a single longer character
variable and take substrings. The issue is converting a character of
length n to and from an array of n characters of length one.
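
The pool-and-headers idea described above might look roughly like the
following. This is only a sketch of the Pascal-style variant (a start
subscript plus a length per string), not code from the original poster.
The names pool_mod, pool_add and pool_get are made up for illustration,
and the sizes are assumptions taken from the figures in the question.

```fortran
module pool_mod
  implicit none
  private
  public :: pool_add, pool_get

  integer, parameter :: max_items = 60000      ! assumed: count from the question
  integer, parameter :: pool_size = 30000000   ! assumed: 1/8 of 60000*4000

  character(len=1), save :: pool(pool_size)    ! the storage pool
  integer, save :: first(max_items)            ! header: start subscript in pool
  integer, save :: length(max_items)           ! header: Pascal-style length
  integer, save :: n_items = 0                 ! strings stored so far
  integer, save :: n_used  = 0                 ! pool characters in use

contains

  ! Append s to the pool. Returns its index, or 0 if there is no room.
  function pool_add(s) result(id)
    character(len=*), intent(in) :: s
    integer :: id, k
    id = 0
    if (n_items >= max_items) return
    if (len(s) > pool_size - n_used) return
    n_items = n_items + 1
    id = n_items
    first(id)  = n_used + 1
    length(id) = len(s)
    do k = 1, len(s)
      pool(n_used + k) = s(k:k)
    end do
    n_used = n_used + len(s)
  end function pool_add

  ! Rebuild string number id as an ordinary character value.
  ! No check is made that id is a valid index.
  function pool_get(id) result(s)
    integer, intent(in) :: id
    character(len=length(id)) :: s
    integer :: k
    do k = 1, length(id)
      s(k:k) = pool(first(id) + k - 1)
    end do
  end function pool_get

end module pool_mod

program pool_demo
  use pool_mod
  implicit none
  integer :: a, b
  a = pool_add('short')
  b = pool_add('a somewhat longer string')
  print *, '|', pool_get(a), '|', len(pool_get(b))
end program pool_demo
```

The copy loops in pool_add and pool_get are exactly the conversion
between a character of length n and n characters of length one. A
sentinel-terminated variant would drop the length array and scan for
the terminator instead.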

The next layer out is to decide whether this is just a storage pool
where you know which pointer to use, or some sort of searchable
data structure that has to determine which pointer does whatever job
is expected of it. But this is not what you asked about.

Storage is so easy and cheap that sometimes I don't bother any more.

Your mileage may vary, as is the usual weaseling out.




From: e p chandler on

<analyst41(a)hotmail.com> wrote in message
news:24f1b2ab-86b6-4518-932b-fd48f901ead4(a)w16g2000vbf.googlegroups.com...
>I need to keep in memory an array of around 60000 character variables,
> each element of which can have a max length of 4000 bytes. But if you
> add up the lengths of all the actual data values, it is only 1/8 of
> 60000*4000.
>
> What would be the cleanest way to store this data to take advantage of
> this fact?
>
> Thanks.

I'll skip the usual rhetorical why?, what_for? and what's wrong with a
database?

How about an array each of whose elements is a derived type. One component
is an integer that specifies the string's length. The second is an
allocatable array of single characters.

---- start text ----
module my_mod
  implicit none

  ! One stored string: its length plus exactly that many single characters.
  type node
    integer :: len
    character, allocatable :: char(:)
  end type
end module my_mod

program my_prog
  use my_mod
  implicit none
  integer,parameter :: buff_size = 80, node_max = 5
  character(buff_size) :: in_buff
  integer :: curr_node,curr_len,curr_pos,num_nodes,ios
  type(node) :: str(node_max)

  ! Read up to node_max lines; stop at a blank line or at end of input.
  num_nodes = 0
  do curr_node = 1,node_max
    read (*,'(a)',iostat=ios) in_buff
    if (ios /= 0) exit
    curr_len = len_trim(in_buff)
    if (curr_len == 0) exit
    str(curr_node)%len = curr_len
    allocate(str(curr_node)%char(curr_len))
    ! Copy the buffer into the array one character at a time.
    do curr_pos = 1,curr_len
      str(curr_node)%char(curr_pos) = in_buff(curr_pos:curr_pos)
    end do
    num_nodes = num_nodes + 1
  end do

  do curr_node = 1,num_nodes
    print *,curr_node,str(curr_node)%len,'|',str(curr_node)%char,'|'
  end do

end program my_prog
---- end text ----

Of course then you have the overhead of converting to and from or
inter-operating with normal string variables, etc.
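
The conversion overhead mentioned above can at least be hidden behind a
pair of small helper functions. The following is a sketch only; the
names convert_mod, to_chars and to_string are hypothetical and do not
come from the program above.

```fortran
module convert_mod
  implicit none
contains

  ! Ordinary string -> array of single characters, one per position.
  function to_chars(s) result(c)
    character(len=*), intent(in) :: s
    character(len=1) :: c(len(s))
    integer :: k
    do k = 1, len(s)
      c(k) = s(k:k)
    end do
  end function to_chars

  ! Array of single characters -> ordinary string of the same length.
  function to_string(c) result(s)
    character(len=1), intent(in) :: c(:)
    character(len=size(c)) :: s
    integer :: k
    do k = 1, size(c)
      s(k:k) = c(k)
    end do
  end function to_string

end module convert_mod

program convert_demo
  use convert_mod
  implicit none
  character(len=1), allocatable :: c(:)

  allocate(c(5))
  c = to_chars('hello')
  ! Round trip: the rebuilt string should compare equal to the original.
  print *, size(c), '|', to_string(c), '|', to_string(c) == 'hello'
end program convert_demo
```

With something like this, the char(:) component of a node could be
filled with to_chars(trim(in_buff)) and read back with to_string, at
the cost of one copy in each direction.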

---- e


From: Jim Xia on

> I'll skip the usual rhetorical why?, what_for? and what's wrong with a
> database?
>
> How about an array each of whose elements is a derived type. One component
> is an integer that specifies the string's length. The second is an
> allocatable array of single characters.
>
> ---- start text ----
> module my_mod
> implicit none
>
> type node
>   integer :: len
>   character, allocatable :: char(:)
> end type
> end module my_mod
>


How about this

type string
  character(:), allocatable :: str
end type


You don't need the len variable. For a variable x of type string,
len(x%str) will be that value.


Then you need an array of type string:

type(string) :: my_memory(60000)

This method saves you space only if the majority of the string lengths
are far less than 4000 bytes. Otherwise the Fortran descriptor will
take a toll on your total memory usage.
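
A small usage sketch of the type above, assuming a compiler that
supports Fortran 2003 deferred-length allocatable characters. The
array is cut down to three elements for illustration, and the names
string_mod and string_demo are made up. Note the allocated() test: an
element that was never assigned has no length to ask about.

```fortran
module string_mod
  implicit none
  type string
    character(:), allocatable :: str
  end type
end module string_mod

program string_demo
  use string_mod
  implicit none
  type(string) :: my_memory(3)     ! 3 instead of 60000, for illustration
  integer :: i, total

  ! Assignment allocates each component to the length of the value.
  my_memory(1)%str = 'short'
  my_memory(2)%str = 'a somewhat longer value'
  ! my_memory(3)%str is deliberately left unallocated.

  total = 0
  do i = 1, size(my_memory)
    if (allocated(my_memory(i)%str)) then
      print *, i, len(my_memory(i)%str), '|', my_memory(i)%str, '|'
      total = total + len(my_memory(i)%str)
    else
      print *, i, ' (empty)'
    end if
  end do
  print *, 'characters stored:', total
end program string_demo
```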

Cheers,

Jim
From: Jim Xia on
OK, for the sake of completeness, here is my version of my_prog:




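module my_mod
  implicit none

  ! Wrapper so that each array element can have its own length.
  type string
    character(:), allocatable :: str
  end type
end module my_mod


program my_prog
  use, intrinsic :: iso_fortran_env, only: input_unit
  use my_mod
  implicit none
  integer,parameter :: max_buff_size = 4000
  integer,parameter :: array_size = 60000
  character(max_buff_size) :: in_buff

  type(string) :: the_memory(array_size)
  integer :: i, ios, whatEverUnit

  ! Placeholder: standard input is assumed here; substitute the unit
  ! of whatever file actually holds the data.
  whatEverUnit = input_unit

  do i = 1, array_size
    read (whatEverUnit, '(a)', iostat=ios) in_buff
    if (ios /= 0) exit     ! stop at end of input or on a read error
    ! Assignment (re)allocates str to the trimmed length (Fortran 2003).
    the_memory(i)%str = trim(in_buff)
  end do

end program my_prog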



Cheers,

Jim