From: Ahmed Bedeer Ahmed on
Dear All,

I'm working on a project where I need to extract text information from some
Word documents. For each character (range), I need to know its page number,
line number and text-column number.

I use Microsoft Office Interop for Word assembly in C#.

I could find the page number and line number easily using the
range.get_Information() method, but I could not find a way to get the
text-column number.

Note: I mean the text column of the page layout, NOT the column number in a table.

Here is a portion of the code:

// After the document is opened using app.Documents.Open()
// (assumes: using Word = Microsoft.Office.Interop.Word;)
// lineNo and pageNo are int variables declared earlier.

foreach (Word.Range range in document.Words)
{
    string text = range.Text;

    // Line and page number of the current word.
    lineNo = (int)range.get_Information(
        Word.WdInformation.wdFirstCharacterLineNumber);
    pageNo = (int)range.get_Information(
        Word.WdInformation.wdActiveEndPageNumber);

    // Position of the word on the page, in points.
    float lineY = (float)range.Characters.Last.get_Information(
        Word.WdInformation.wdVerticalPositionRelativeToPage);
    float wordX = (float)range.get_Information(
        Word.WdInformation.wdHorizontalPositionRelativeToPage);

    // Here I need to know the text-column number of the current range.
}

Please help.

Thanks in advance.
From: Graham Mayor on
Text column information would seem to be a bit pointless with proportionally
spaced fonts. It is essentially the character count from the start of the
current 'line' to the cursor position + 1. The problem is that 'line' is not a
parameter of the current range. What constitutes a 'line' in the document
you are evaluating?
For example, if each 'line' is a paragraph, you could move the start of the
range from the cursor position to the start of the paragraph and measure the
length of the range. The following is a VBA equivalent of that:

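Dim oRng As Range
Dim iCol As Long
' Start from the current selection (the cursor position).
Set oRng = Selection.Range
' Extend the range back to the start of its paragraph.
oRng.Start = oRng.Paragraphs(1).Range.Start
' Characters before the cursor, plus 1, give the character column.
iCol = Len(oRng.Text) + 1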
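
Since the original poster works in C#, the same counting idea can be sketched there without touching Word at all. This is only an illustration, not code from the thread: the class and method names are hypothetical, and it assumes the text uses '\r' as the paragraph mark, which is how Word's Range.Text separates paragraphs.

```csharp
using System;

static class CharacterColumn
{
    // Hypothetical helper: returns the 1-based character position of
    // 'offset' within its paragraph. Assumes paragraphs end with '\r'.
    public static int InParagraph(string text, int offset)
    {
        if (text == null) throw new ArgumentNullException("text");
        if (offset < 0 || offset > text.Length)
            throw new ArgumentOutOfRangeException("offset");

        // Find the paragraph mark that precedes the offset, if any.
        int previousMark = offset == 0 ? -1 : text.LastIndexOf('\r', offset - 1);

        // With no earlier mark, previousMark is -1 and the result is offset + 1.
        return offset - previousMark;
    }
}
```

For the text "abc\rdef", the character 'e' sits at offset 5, and CharacterColumn.InParagraph("abc\rdef", 5) gives 2, because 'e' is the second character of the second paragraph.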


--
<>>< ><<> ><<> <>>< ><<> <>>< <>><<>
Graham Mayor - Word MVP

My web site www.gmayor.com
Word MVP web site http://word.mvps.org
<>>< ><<> ><<> <>>< ><<> <>>< <>><<>
From: Ahmed Bedeer on
Hi Graham,

Thank you very much for your reply. But I'm afraid I wasn't clear enough in my
question. The information I'm seeking is the text-column number, not the
character column within the line, which is given by the character index from
the start of the line.

Every page in the documents I'm processing contains two text columns, set up
as part of the page layout (page setup). So I need to know whether a range is
in the left or the right column of the page.

So please help me with that.

Best Regards,

Ahmed Bedeer.
From: Jay Freedman on
Hi Ahmed,

As far as I know, Word doesn't make the text column available
directly. However, your code has already obtained the horizontal
position (as the variable wordX). All you need to do now is determine
whether that is less than or greater than half of the page width. Both
measurements must be in the same units (probably points).
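
The comparison described above could be sketched in C# roughly as follows. This is a minimal illustration under stated assumptions, not code from the thread: the class, method and parameter names are hypothetical, the columns are assumed to be of equal width and laid out left to right, and all values are in points measured from the left edge of the page. In Word, the margin, column width and spacing would presumably be read from the section's PageSetup (LeftMargin and the TextColumns collection), which the thread does not show.

```csharp
using System;

static class TextColumnLocator
{
    // Hypothetical helper: returns the 1-based text column that contains
    // the horizontal position x. Assumes equal-width, left-to-right columns.
    public static int ColumnAt(float x, float leftMargin, float columnWidth,
                               float spacing, int columnCount)
    {
        if (columnCount < 1)
            throw new ArgumentOutOfRangeException("columnCount");

        // Distance from the start of one column to the start of the next.
        float stride = columnWidth + spacing;

        // Split each gutter down the middle, so the boundary between two
        // columns lies halfway between them.
        int index = (int)Math.Floor((x - leftMargin + spacing / 2f) / stride);

        // Clamp so that text hanging into a margin still maps to a column.
        return Math.Max(0, Math.Min(columnCount - 1, index)) + 1;
    }
}
```

With two columns and equal left and right margins, the boundary computed here falls exactly at half the page width, so the sketch reduces to the half-page test. For example, on a 612-point-wide page with 72-point margins, two 216-point columns and 36 points between them, TextColumnLocator.ColumnAt(wordX, 72f, 216f, 36f, 2) switches from 1 to 2 at x = 306.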

--
Regards,
Jay Freedman
Microsoft Word MVP FAQ: http://word.mvps.org
Email cannot be acknowledged; please post all follow-ups to the
newsgroup so all may benefit.

On Thu, 18 Feb 2010 05:05:01 -0800, Ahmed Bedeer
<AhmedBedeer(a)discussions.microsoft.com> wrote:

>Hi Graham,
>
>Thank you very much for your reply. But, I'm afraid I wasn't as clear in my
>question. The information that I'm seeking is text column number, not the
>current column number in the line that's known by the character index from
>the start of the line.
>
>Every page in the documents I'm processing contains two text-columns, that's
>considered a page-setup / page-layout . so I need to know if a range is in
>the right or left page column.
>
>So, please help me that.
>
>Best Regards,
>
>Ahmed Bedeer.
From: Ahmed Bedeer on
Hi Jay,

Yes, I've already tried that using wordX, but I need something more direct
and more accurate, because getting wordX sometimes crashes Microsoft Word
2002.

Anyway, thank you very much.

Regards,

Ahmed.