From: David Evans on
Dear Andrey

If the records are absolutely identical, eliminating duplicates is easy:
* Append one file to the end of the other file.
* Sort the combined file, so that duplicate records end up one
after the other.
* Use a program to write a new version of the combined file that
compares consecutive records and omits the second of any identical
pair.

I do this frequently and could do it for you.
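
The append, sort, and drop-consecutive-duplicates steps can be sketched
in a few lines of shell. This is only an illustration, assuming plain
text files with one record per line; the function name merge_unique and
the file names in the usage line are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: merge two line-oriented files and drop exact duplicates.
merge_unique() {
    # cat appends the second file to the first,
    # sort puts identical records next to each other,
    # uniq keeps only the first of any run of identical consecutive lines.
    cat -- "$1" "$2" | sort | uniq
}

# Usage (file names are placeholders):
#   merge_unique first.txt second.txt > merged.txt
```

Note that this only removes records that match byte for byte; a stray
trailing space or a different line ending is enough to keep both copies.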

If the records are not absolutely identical then solving the problem
is not so easy.
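
As one possible starting point for the harder case, the sketch below
lists words that appear in more than one distinct entry. It assumes
hunspell-style entries of the form word/FLAGS, and it assumes the only
differences are the flags, trailing whitespace, or Windows line endings;
other kinds of near-duplicates would need different handling. The
function name conflicting_words is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: print each word that occurs in two or more
# entries that differ after whitespace and line endings are normalized.
conflicting_words() {
    cat -- "$@" \
        | tr -d '\r' \
        | sed -e 's/[[:space:]]*$//' \
        | sort -u \
        | sed -e 's|/.*$||' \
        | sort \
        | uniq -d
}

# Usage (file names are placeholders):
#   conflicting_words main.dic medical.dic
```

The first sort -u removes entries that are identical after
normalization, so anything uniq -d still reports is a word whose
entries genuinely differ and needs a manual decision.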

How many words are in each dictionary?
How large are the two files?

Kind regards

David

>I made a Russian medical dictionary for OpenOffice, but can use it together
>with the main Russian dictionary. I decided to merge the two dictionaries
>into one, but don't know how :(
>
>Is there a tool that merges two dictionaries into one and removes
>duplicates?
>--
>Andrey Yurkovsky
>
>
>---------------------------------------------------------------------
>To unsubscribe, e-mail: discuss-unsubscribe(a)openoffice.org
>For additional commands, e-mail: discuss-help(a)openoffice.org
From: Andrey Yurkovsky on
Thanks for the reply.

Sergey Kurakin has already helped me with his script:

#!/bin/bash

# Combines two or more hunspell dictionaries.
# (C) 2010 Sergey Kurakin <kurakin_at_altlinux_dot_org>

# Attention! All source dictionaries MUST share the same affix file.

# Usage: dic_combine source1.dic source2.dic [source3.dic...] > combined.dic

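TEMPFILE=$(mktemp) || exit 1
trap 'rm -f "$TEMPFILE"' EXIT

# Merge all sources, drop duplicate lines, then delete the numeric
# word-count lines and any empty lines.
cat -- "$@" | sort --unique | sed -r '/^[0-9]*$/d' > "$TEMPFILE"

# Print the new entry count first, then the entries themselves.
wc -l < "$TEMPFILE"
cat "$TEMPFILE"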

--
Andrey Yurkovsky



From: Harold Fuchs on
Andrey Yurkovsky wrote:
> I made a Russian medical dictionary for OpenOffice, but can use it together
> with the main Russian dictionary. I decided to merge the two dictionaries
> into one, but don't know how :(
>
> Is there a tool that merges two dictionaries into one and removes
> duplicates?
>
I think you should ask this question on
dev(a)lingucomponent.openoffice.org, which is where language- and
dictionary-related issues are discussed. Note that you will need to
subscribe to that list before posting to it. See
http://lingucomponent.openoffice.org/ for details. You may even find
the solution to your problem on that page or the pages it links to.

--
Harold Fuchs
London, England

