From: David Evans on 22 Mar 2010 07:27

Dear Andrey,

If the records are absolutely identical, eliminating duplicates is easy:

* Append one file to the end of the other file.
* Sort the combined file; duplicate records will then be one after the other.
* Use a program to write a new version of the combined file that checks whether any consecutive records are the same and leaves out the second of any identical pair.

I do this frequently and could do it for you.

If the records are not absolutely identical, then solving the problem is not so easy. How many words are in each dictionary? How large are the two files?

Kind regards,
David

>I made a Russian medical dictionary for OpenOffice, for use together
>with the main Russian dictionary. I decided to merge the two dictionaries
>into one, but I don't know how :(
>
>Is there a tool that merges two dictionaries into one and removes
>duplicates?
>--
>Andrey Yurkovsky
>
>---------------------------------------------------------------------
>To unsubscribe, e-mail: discuss-unsubscribe(a)openoffice.org
>For additional commands, e-mail: discuss-help(a)openoffice.org
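The three steps David lists (append, sort, drop the second of any identical consecutive pair) could be sketched in bash roughly as follows. This is only an illustration under stated assumptions: the function name `merge_identical` is hypothetical and does not come from the thread, the input is assumed to be plain text with one record per line, and `LC_ALL=C` is an added choice so that only byte-identical lines are grouped together.

```shell
#!/bin/bash
# merge_identical is a hypothetical name; it is not from the thread.
# Usage: merge_identical file1 file2 [file3...] > combined
merge_identical() {
    local line prev seen=0
    # Steps 1 and 2: append the files, then sort so that identical
    # records end up next to each other. LC_ALL=C (an assumption made
    # here) makes sort compare raw bytes.
    cat -- "$@" | LC_ALL=C sort | while IFS= read -r line; do
        # Step 3: print a record only if it differs from the previous one.
        if [ "$seen" -eq 0 ] || [ "$line" != "$prev" ]; then
            printf '%s\n' "$line"
        fi
        prev=$line
        seen=1
    done
}
```

It would be called as, for example, `merge_identical first.txt second.txt > combined.txt` (file names made up for the example). The loop spells out the consecutive-record check explicitly; `sort --unique` performs the sort and the duplicate removal in a single step.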
From: Andrey Yurkovsky on 22 Mar 2010 10:55

Thanks for the reply. Sergey Kurakin has already helped me with his script:

#!/bin/bash
# Combines two or more hunspell dictionaries.
# (C) 2010 Sergey Kurakin <kurakin_at_altlinux_dot_org>
# Attention! All source dictionaries MUST share the same affix file.
# Usage: dic_combine source1.dic source2.dic [source3.dic...] > combined.dic

TEMPFILE=$(mktemp)

# Concatenate all sources, sort them and drop duplicate lines, then delete
# empty lines and digit-only lines (the word-count header of each source).
cat -- "$@" | sort --unique | sed -r '/^[0-9]*$/d' > "$TEMPFILE"

# Write the new entry count first, then the merged entries.
wc -l < "$TEMPFILE"
cat "$TEMPFILE"

rm -f "$TEMPFILE"

--
Andrey Yurkovsky
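The script's warning that all source dictionaries must share the same affix file could be checked before merging. The sketch below is one possible way to do that, not part of Sergey's script: the helper name `same_affix` is hypothetical, and it assumes that "the same affix file" means the `.aff` files are byte-for-byte identical.

```shell
#!/bin/bash
# same_affix is a hypothetical helper name, not part of Sergey's script.
# It returns 0 when every listed .aff file is byte-identical to the first,
# and 1 otherwise.
# Usage: same_affix first.aff second.aff [third.aff...]
same_affix() {
    local first=$1 other
    shift
    for other in "$@"; do
        # cmp -s prints nothing; it exits non-zero if the files differ
        # or if one of them cannot be read.
        cmp -s -- "$first" "$other" || return 1
    done
    return 0
}
```

A call such as `same_affix main.aff medical.aff && dic_combine main.dic medical.dic > combined.dic` (file names made up for the example) would then run the merge only when the affix files match.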
From: Harold Fuchs on 21 Mar 2010 20:28

Andrey Yurkovsky wrote:
> I made a Russian medical dictionary for OpenOffice, for use together
> with the main Russian dictionary. I decided to merge the two dictionaries
> into one, but I don't know how :(
>
> Is there a tool that merges two dictionaries into one and removes
> duplicates?
>

I think you should ask this question on dev(a)lingucomponent.openoffice.org, which is where language- and dictionary-related issues are discussed. Note that you will need to subscribe to that list before posting to it. See http://lingucomponent.openoffice.org/ for details; you may even find the solution to your problem on that page or the pages it links to.

--
Harold Fuchs
London, England