From: analyst41 on 21 Apr 2010 18:22

I need to keep in memory an array of around 60000 character variables, each element of which can have a max length of 4000 bytes. But if you add up the lengths of all the actual data values, it is only 1/8 of 60000*4000.

What would be the cleanest way to store this data to take advantage of this fact?

Thanks.
From: Gordon Sande on 21 Apr 2010 20:47

On 2010-04-21 19:22:06 -0300, "analyst41(a)hotmail.com" <analyst41(a)hotmail.com> said:

> I need to keep in memory an array of around 60000 character variables,
> each element of which can have a max length of 4000 bytes. But if you
> add up the lengths of all the actual data values, it is only 1/8 of
> 60000*4000.
>
> What would be the cleanest way to store this data to take advantage of
> this fact?
>
> Thanks.

For a symbol table where the average length is short but a few symbols can be much longer, I have used pointers (an integer-valued subscript) into a storage pool. You can either be like Pascal and have a length associated with each pointer, or like C and use a sentinel to end each string (C uses a nul, but ASCII does have ETX for end-of-text). It depends on how important it is to easily know the length. Just another example of "make sure you know all the operations to be done before you decide on the data structure".

The pointers, often called headers, will be integers, but the storage pool will be characters. I tend to use an array of characters of length one, but one might use a longer character and find substrings. The issue is converting from a character of length n to and from an array of n characters of length one.

The next layer out is to decide whether this is just a storage pool where you know which pointer to use, or some sort of searchable data structure which has to determine which pointer does whatever job is expected of it. But this is not what you asked about.

Storage is so easy and cheap that sometimes I don't bother any more. Your mileage may vary, as is the usual weaseling out.
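A minimal sketch of the storage-pool idea described above; it is an illustration, not Gordon's actual code. It keeps Pascal-style (start, length) headers in integer arrays and, to sidestep the character-array conversion he mentions, uses the substring variant: one long character variable serves as the pool. The names pool_mod, pool_t, pool_init, pool_add and pool_get are invented for this example, and the deferred-length character component assumes a Fortran 2003 (or later) compiler.

```fortran
module pool_mod
  implicit none

  type pool_t
    character(len=:), allocatable :: store    ! all text, packed end to end
    integer, allocatable :: first(:)          ! header: start position in store
    integer, allocatable :: length(:)         ! header: length of each string
    integer :: n_used  = 0                    ! characters of store in use
    integer :: n_items = 0                    ! strings stored so far
  end type pool_t

contains

  ! Reserve room for max_items strings totalling at most max_chars characters.
  subroutine pool_init(p, max_items, max_chars)
    type(pool_t), intent(out) :: p
    integer, intent(in) :: max_items, max_chars
    allocate(character(len=max_chars) :: p%store)
    allocate(p%first(max_items), p%length(max_items))
  end subroutine pool_init

  ! Append one string; returns its item number, or 0 if the pool is full.
  function pool_add(p, s) result(id)
    type(pool_t), intent(inout) :: p
    character(len=*), intent(in) :: s
    integer :: id
    id = 0
    if (p%n_items >= size(p%first)) return
    if (p%n_used + len(s) > len(p%store)) return
    id = p%n_items + 1
    p%first(id)  = p%n_used + 1
    p%length(id) = len(s)
    p%store(p%n_used + 1 : p%n_used + len(s)) = s
    p%n_used  = p%n_used + len(s)
    p%n_items = id
  end function pool_add

  ! Return a copy of string number id (id must be a valid item number).
  function pool_get(p, id) result(s)
    type(pool_t), intent(in) :: p
    integer, intent(in) :: id
    character(len=:), allocatable :: s
    s = p%store(p%first(id) : p%first(id) + p%length(id) - 1)
  end function pool_get

end module pool_mod

program pool_demo
  use pool_mod
  implicit none
  type(pool_t) :: p
  integer :: a, b

  call pool_init(p, max_items=10, max_chars=100)
  a = pool_add(p, 'short')
  b = pool_add(p, 'a somewhat longer symbol')

  print '(3a)', '|', pool_get(p, a), '|'
  print '(3a)', '|', pool_get(p, b), '|'

  ! Self-check: the round trip must preserve both content and length.
  if (pool_get(p, a) /= 'short' .or. len(pool_get(p, a)) /= 5) &
    error stop 'round trip failed'
end program pool_demo
```

For the data in the original question, the pool would be sized at roughly 60000*4000/8 = 30,000,000 characters, plus two integers per string for the headers.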
From: e p chandler on 21 Apr 2010 20:57

<analyst41(a)hotmail.com> wrote in message news:24f1b2ab-86b6-4518-932b-fd48f901ead4(a)w16g2000vbf.googlegroups.com...

> I need to keep in memory an array of around 60000 character variables,
> each element of which can have a max length of 4000 bytes. But if you
> add up the lengths of all the actual data values, it is only 1/8 of
> 60000*4000.
>
> What would be the cleanest way to store this data to take advantage of
> this fact?
>
> Thanks.

I'll skip the usual rhetorical why?, what_for? and what's wrong with a database?

How about an array each of whose elements is a derived type. One component is an integer that specifies the string's length. The second is an allocatable array of single characters.

---- start text ----
module my_mod
  implicit none

  type node
    integer :: len                       ! number of characters stored
    character, allocatable :: char(:)    ! the text, one character per element
  end type
end module my_mod

program my_prog
  use my_mod
  implicit none

  integer, parameter :: buff_size = 80, node_max = 5

  character(buff_size) :: in_buff
  integer :: curr_node, curr_len, curr_pos, num_nodes
  type(node) :: str(node_max)

  num_nodes = 0

  ! Read up to node_max lines; a blank line ends the input.
  do curr_node = 1, node_max
    read '(a)', in_buff
    curr_len = len(trim(in_buff))
    if (curr_len == 0) exit
    str(curr_node)%len = curr_len
    allocate(str(curr_node)%char(curr_len))
    ! Copy the line into the array one character at a time.
    do curr_pos = 1, curr_len
      str(curr_node)%char(curr_pos) = in_buff(curr_pos:curr_pos)
    end do
    num_nodes = num_nodes + 1
  end do

  do curr_node = 1, num_nodes
    print *, curr_node, str(curr_node)%len, '|', str(curr_node)%char, '|'
  end do
end program my_prog
---- end text ----

Of course then you have the overhead of converting to and from, or inter-operating with, normal string variables, etc.

---- e
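The conversion overhead mentioned above can be confined to two small helpers. The following is a sketch only; the names char_convert, to_chars and to_string are invented for illustration and do not come from any post in this thread. It assumes a compiler with Fortran 2003 allocation on assignment.

```fortran
module char_convert
  implicit none
contains

  ! character(len=n)  ->  array of n characters of length one
  function to_chars(s) result(c)
    character(len=*), intent(in) :: s
    character(len=1) :: c(len(s))
    integer :: i
    do i = 1, len(s)
      c(i) = s(i:i)
    end do
  end function to_chars

  ! array of n characters of length one  ->  character(len=n)
  function to_string(c) result(s)
    character(len=1), intent(in) :: c(:)
    character(len=size(c)) :: s
    integer :: i
    do i = 1, size(c)
      s(i:i) = c(i)
    end do
  end function to_string

end module char_convert

program convert_demo
  use char_convert
  implicit none
  character(len=1), allocatable :: c(:)
  character(len=:), allocatable :: s

  c = to_chars('hello')      ! allocated on assignment to 5 elements
  s = to_string(c)           ! back to an ordinary string of length 5

  print '(a,i0)', 'characters stored: ', size(c)
  print '(3a)', '|', s, '|'

  ! Self-check: content and length must survive the round trip.
  if (s /= 'hello' .or. len(s) /= 5) error stop 'round trip failed'
end program convert_demo
```

With helpers like these, a component such as `char(:)` in the `node` type could be filled with `to_chars(trim(in_buff))` instead of an explicit loop at each call site.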
From: Jim Xia on 21 Apr 2010 21:20

> I'll skip the usual rhetorical why?, what_for? and what's wrong with a
> database?
>
> How about an array each of whose elements is a derived type. One component
> is an integer that specifies the string's length. The second is an
> allocatable array of single characters.
>
> ---- start text ----
> module my_mod
> implicit none
>
> type node
> integer :: len
> character, allocatable :: char(:)
> end type
> end module my_mod
>

How about this

  type string
    character(:), allocatable :: str
  end type

You don't need the len variable. len(node%str) will be that value. Then you need an array of type string:

  type(string) :: my_memory(60000)

This method saves you space only if the majority of the string lengths are far less than 4000 bytes. Otherwise the Fortran descriptor will take a toll on your total memory usage.

Cheers,

Jim
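To put rough numbers on the descriptor cost, here is a small estimate, offered as a sketch under stated assumptions. It uses the Fortran 2008 intrinsic storage_size to ask the compiler how large one element of the derived type is; that figure is compiler dependent, assumes 8-bit bytes, and does not include whatever bookkeeping the heap allocator adds per allocation. The program name and variable names are invented for this example.

```fortran
program descriptor_cost
  implicit none

  type string
    character(:), allocatable :: str
  end type

  integer, parameter :: n_strings = 60000, max_len = 4000

  type(string) :: probe
  integer :: per_element, fixed_bytes, data_bytes, overhead_bytes

  ! storage_size returns bits; this is the per-element cost of the
  ! derived type itself (the descriptor), not of the text it points to.
  per_element    = storage_size(probe) / 8

  fixed_bytes    = n_strings * max_len      ! character(4000) :: a(60000)
  data_bytes     = fixed_bytes / 8          ! actual text, per the question
  overhead_bytes = n_strings * per_element  ! descriptors for all elements

  print '(a,i0)', 'fixed-length array, bytes:        ', fixed_bytes
  print '(a,i0)', 'actual text, bytes:               ', data_bytes
  print '(a,i0)', 'descriptor bytes per element:     ', per_element
  print '(a,i0)', 'descriptor bytes, whole array:    ', overhead_bytes
  print '(a,i0)', 'allocatable version total, bytes: ', data_bytes + overhead_bytes
end program descriptor_cost
```

The fixed-length figure is 60000*4000 = 240,000,000 bytes and the text itself is one eighth of that, 30,000,000 bytes; the descriptor total matters only when it is comparable to the difference between the two.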
From: Jim Xia on 21 Apr 2010 21:32
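OK, for the sake of completeness, let's see my version of my_prog:

module my_mod
  implicit none

  type string
    character(:), allocatable :: str
  end type
end module my_mod

program my_prog
  use, intrinsic :: iso_fortran_env, only: input_unit
  use my_mod
  implicit none

  integer, parameter :: max_buff_size = 4000
  integer, parameter :: array_size = 60000

  character(max_buff_size) :: in_buff
  type(string) :: the_memory(array_size)

  integer :: i, whatEverUnit, ios

  ! Assumption: read from standard input. Substitute whatever unit
  ! is actually connected to the data.
  whatEverUnit = input_unit

  do i = 1, array_size
    read (whatEverUnit, '(a)', iostat=ios) in_buff
    if (ios /= 0) exit                    ! end of file or read error
    ! Assignment (re)allocates str to exactly the trimmed length.
    the_memory(i)%str = trim(in_buff)
  end do
end program my_prog

Cheers,

Jim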
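Once the array is filled, the length of each entry comes straight from len, with no separate length component. A short sketch, using three hard-coded entries in place of the file read (the program name and sample strings are assumptions made for illustration):

```fortran
program use_strings
  implicit none

  type string
    character(:), allocatable :: str
  end type

  type(string) :: the_memory(3)
  integer :: i, total

  ! Each assignment allocates str to exactly the length of the value.
  the_memory(1)%str = 'alpha'
  the_memory(2)%str = 'a considerably longer entry'
  the_memory(3)%str = ''                  ! allocated, with length zero

  total = 0
  do i = 1, size(the_memory)
    total = total + len(the_memory(i)%str)
    print '(i0,1x,i0,1x,3a)', i, len(the_memory(i)%str), &
          '|', the_memory(i)%str, '|'
  end do

  print '(a,i0)', 'total bytes of text: ', total
end program use_strings
```

Note that len may only be applied to elements whose str has been allocated; elements that were never assigned should be checked with allocated(the_memory(i)%str) first.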