From: Dmitry Bond. on
Hi.

I'm testing some of our software on a Linux/Oracle VM: CentOS Linux
5.4 with Oracle 11.x.
It is client/server software; the client part runs on Windows (2000/XP/
7), and the server part (in this case) runs on CentOS Linux.

An application on the client side uses a back-connection approach
when requesting a big chunk of data from the server. Specifically, the
client application opens a listening socket on some (random) port
number and sends its own IP plus that port number to the server,
expecting that the server will open a back-connection to the specified
IP/port and write all the data.

I have run into very strange problems with these back-connections. Every
time, it reports error code 111 (ECONNREFUSED). According to the
man pages it means "No one listening on the remote address". But
that is not true, because:
a) there are no firewalls in the way! I turned off the firewall on both
the client and the server computers.
b) when I run the client application under a debugger I can clearly see
that it starts to listen on the specified port, and I also see an open
listening connection in TaskInfo.
c) and one more (the main) thing: I can connect to the specified IP/port
with telnet from the server, and the client application receives the
data I type into telnet (while running it under the debugger I started a
new PuTTY session to the Linux environment and ran telnet back to the
specified IP and port; it works fine).
d) under the debugger and in the log files I can clearly see that all
connection parameters are specified correctly: I see the correct IP and
port, and telnet works fine to that IP and port.

So, since I can connect to Linux, can connect back to my computer from
Linux, and can see that all required listening connections are open and
running, my conclusion is that something is wrong with the socket
options on Linux.
Maybe it is something with Oracle, maybe with the application running on
Linux...

The question is: why does the connect(...) call return ECONNREFUSED?
What else can I check/validate/test to make it work?
Any ideas/tools you can recommend?

Please look at the code that opens the back-connection (enclosed).

Regards,
Dmitry.

PS. The same problem exists on Fedora Linux 12 and CentOS 5.5, so it
certainly looks like something common to Linux systems (or maybe to Linux
+Oracle).

PPS. We have the server part ported to Windows+DB2 and some other
platforms, and everything works fine there. So the same back-connection
approach has been working fine for many years on many platforms. Only now
(when porting it to Linux/Oracle) have we run into problems with it.

PPPS. Here is the example of code used to open back-connection from
server to client:

static struct sockaddr_in out_sock;
static int out_fd;
static int isConnected = 0;

int backconnection_open( const char *addr, int port )
{
    int err;

    memset(&out_sock, 0, sizeof(out_sock));
    out_sock.sin_addr.s_addr = inet_addr(addr);
    if (out_sock.sin_addr.s_addr == INET_ERROR)
    {
        LogError("Unable to convert IP address %s from dotted to internal format.", addr);
        return 10001;
    };

    out_sock.sin_family = AF_INET;
    out_sock.sin_port = (unsigned short)port;

    out_fd = socket( (short)AF_INET, SOCK_STREAM, 0 );
    if (out_fd < 0)
    {
        LogError("Unable to create socket - %d", errno);
        return 10002;
    };

    err = connect( out_fd, (struct sockaddr*)&out_sock,
                   (int)sizeof(out_sock) );
    if (err)
    {
        err = errno;
        LogError("Unable to connect to remote address %s(%08X) port %d tcp/ip error %d",
                 addr, (long)out_sock.sin_addr.s_addr, (int)out_sock.sin_port,
                 err);
        return 10003;
    };

    isConnected = 1;
    return 0;
}
From: Lew Pitcher on
On June 3, 2010 13:36, in comp.os.linux.misc, dima_ben(a)ukr.net wrote:

> Hi.
>
> I'm testing some our software on Linux/Oracle VM. It is a CentOS Linux
> 5.4 and Oracle 11.x.
> It is client/server software, client part running on Windows (2000/XP/
> 7), server part (in this case) running on CentOS Linux.
>
> There is an application on client side which use the back-connection
> approach when requesting big chunk of data from server. In particular
> client application opens the listening socket on some (random) port
> number and send own IP + that port number to a server. Expecting that
> server will open a back-connection to a specified IP/port and write
> all data.
>
> I have faced very strange problems with such back connections. All the
> time it reports a error code 111 (ECONNREFUSED). According to a
> ''man'' pages it means - "No one listening on the remote address". But
> that is not true! Because:
> a) there are no firewalls on a way! I turned off firewall on client
> and on server computers.
> b) when I run client application under debugger I can clearly see that
> it starts to listen on specified port, also - I see an open listening
> connection in a TaskInfo.
> c) and one more (the main) thing - I can connect specified ip/port
> with a telnet from server and client application receives the data I'm
> typing in a telnet (while run it under debug I started new putty to
> linux env and run telnet back to specified ip+port - it works fine).
> d) under debugger and in log files I can clearly see that all
> connection parameters specified correctly - I see a correct IP and
> port and telnet working fine to that IP+port.
>
> So, if I can connect Linux and then can connect my computer from Linux
> and see that all required listening connections are open and running
> then I make a conclusion - it looks like something wrong with socket
> options on Linux.

Probably.

> Maybe something with Oracle, maybe with application running on
> Linux...
>
> Question is - why the connect(...) call returns a ECONNREFUSED ?

For SOCK_DGRAM, the target system sends an ICMP error message.
For SOCK_STREAM, the target system sends an RST (I think).

In either case, the target system's TCP/IP stack is telling the sender's
TCP/IP stack that there is nothing to connect to, and the sender's stack
relays that back to the sending application as the ECONNREFUSED status.
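
The SOCK_STREAM case can be reproduced locally. What follows is only a minimal sketch, assuming a Linux host with a working loopback interface; the helper name errno_for_unlistened_port is made up for illustration. It binds a TCP socket to a loopback port but never calls listen() on it, then connects to that port and returns whatever errno connect() sets.

```c
#include <arpa/inet.h>   /* htonl */
#include <assert.h>
#include <errno.h>
#include <netinet/in.h>  /* struct sockaddr_in, INADDR_LOOPBACK */
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>      /* close */

/* Hypothetical helper: connects to a loopback port that is bound but NOT
 * listening. Returns the errno set by connect(), 0 if the connect
 * unexpectedly succeeds, or -1 if the setup itself fails. */
int errno_for_unlistened_port(void)
{
    struct sockaddr_in sa;
    socklen_t len = sizeof sa;
    int holder, client, result;

    /* This socket only reserves a port; listen() is never called on it. */
    holder = socket(AF_INET, SOCK_STREAM, 0);
    if (holder < 0)
        return -1;

    memset(&sa, 0, sizeof sa);
    sa.sin_family = AF_INET;
    sa.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    sa.sin_port = 0;                      /* let the kernel pick a free port */

    if (bind(holder, (struct sockaddr *)&sa, sizeof sa) < 0 ||
        getsockname(holder, (struct sockaddr *)&sa, &len) < 0) {
        close(holder);
        return -1;
    }
    assert(len == sizeof sa);             /* sa now holds the chosen port */

    client = socket(AF_INET, SOCK_STREAM, 0);
    if (client < 0) {
        close(holder);
        return -1;
    }

    /* Nothing is listening on that port, so the connect should be refused. */
    result = (connect(client, (struct sockaddr *)&sa, sizeof sa) == 0) ? 0 : errno;

    close(client);
    close(holder);
    return result;
}
```

On Linux the returned value should be ECONNREFUSED (111), the same code reported in the original post.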

> What can I also check/validate/test to make it working?

The raw data as it goes across the wire, for one.

> Any ideas/tools you can recommend?

tcpdump and/or wireshark come to mind.

> Please look a code opening back connection (enclosed).
>
> Regards,
> Dmitry.

Some possibilities to think about:
1) your sending app is sending a different protocol (PF_*) or different type
(SOCK_*) than your receiving app is expecting.
2) your sending app is sending to a different machine or port than your
receiving app is expecting.

From the looks of your code, below, I'd suspect #2 - you aren't sending to
the address/port that you /think/ you are sending to.

> PS. The same problem exists on Fedora Linux 12 and CentOS 5.5, so for
> sure - it looks like a common thing for Linuxes (or maybe - for Linux
> +Oracle).
>
> PPS. We have server part ported to Windows+DB2 and some other
> platforms - all working fine there. So, the same approach with back-
> connection working fine for many years on many platforms. Only now
> (when porting it to Linux/Oracle) we faced problems with it.
>
> PPPS. Here is the example of code used to open back-connection from
> server to client:
>
> static struct sockaddr_in out_sock;
> static int out_fd;
> static int isConnected = 0;
>
> int backconnection_open( const char *addr, int port )
> {
> int err;
>
> memset(&out_sock, 0, sizeof(out_sock));
> out_sock.sin_addr.s_addr = inet_addr(addr);
> if (out_sock.sin_addr.s_addr == INET_ERROR)
> {
> LogError("Unable to convert IP address %s from dotted to internal
> format.", addr);
> return 10001;
> };
>
> out_sock.sin_family = AF_INET;
> out_sock.sin_port = (unsigned short)port;

sin_port should be in network byte order, and the (unsigned short) cast is
unnecessary.

What you want is
out_sock.sin_port = htons(port);
assuming that port is in host byte order.

If port /is/ in host byte order, then this might be the cause of your
problem. On x86 compatible systems, host byte order is "little endian".

But, network byte order is always "big endian".

So, without the htons(), your host int port is incorrectly formatted, and
will direct the connection/conversation to a /different/ port than you
expect it to. Probably, /that/ port is "closed" (has no service open behind
it), and the target TCP/IP stack is passing on the RST that indicates that
the connection to /that/ port is refused.
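
To make the mix-up concrete, here is a small sketch (the function names are made up for illustration, and 5000 in the note below is an arbitrary example port). Since sin_port is always read as network byte order, the port the peer actually sees is ntohs() of whatever was stored there.

```c
#include <arpa/inet.h>  /* htons, ntohs */
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Port the peer sees when the host-order value is stored as-is,
 * i.e. the effect of  sin_port = (unsigned short)port;  */
unsigned port_seen_without_htons(int port)
{
    assert(port >= 0 && port <= 65535);
    uint16_t stored = (uint16_t)port;       /* raw host-order bytes */
    return ntohs(stored);                   /* how the network reads them */
}

/* Port the peer sees when the value is converted first,
 * i.e. the effect of  sin_port = htons(port);  */
unsigned port_seen_with_htons(int port)
{
    assert(port >= 0 && port <= 65535);
    uint16_t stored = htons((uint16_t)port);
    return ntohs(stored);                   /* always the original port */
}

/* Prints both cases side by side for one port number. */
void show_port_mismatch(int port)
{
    printf("without htons(): peer sees port %u\n", port_seen_without_htons(port));
    printf("with htons():    peer sees port %u\n", port_seen_with_htons(port));
}
```

On a little-endian host, show_port_mismatch(5000) should report 34835 for the unconverted case (0x1388 read back as 0x8813) and 5000 for the converted one. On a big-endian host both lines would show 5000, which is why code like this can appear to work on some platforms.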

>
> out_fd = socket( (short)AF_INET, SOCK_STREAM, 0 );

Additionally, the cast here
(short)AF_INET
is redundant and wrong. AF_INET is /defined/ to fit into the int domain
value that is the first argument of socket(). Casting it to short (which is
then implicitly expanded back to int as part of the process of preparing
the function call) is at best redundant, and at worst can discard some
significance. In this case, AF_INET fits properly within a short, but had
it been some other value, this cast would have discarded enough so that the
resulting value would not be the same as AF_INET.

> if (out_fd < 0)
> {
> LogError("Unable to create socket - %d", errno);
> return 10002;
> };
>
> err = connect( out_fd, (struct sockaddr*)&out_sock,
> (int)sizeof(out_sock) );
> if (err)
> {
> err = errno;
> LogError("Unable to connect to remote address %s(%08X) port %d tcp/
> ip error %d",
> addr, (long)out_sock.sin_addr.s_addr, (int)out_sock.sin_port,

Again, both sin_addr.s_addr and sin_port are stored in network byte ("big
endian") order. Your printf() will report their values incorrectly, as, on
an x86 platform, it expects its integer arguments to be in "little endian"
order.

> err);
> return 10003;
> };
>
> isConnected = 1;
> return 0;
> }
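
For the logging point above, one possible approach is to convert back to host order before printing. This is only a sketch: format_peer and demo_format_peer are hypothetical names, and the address and port in the demo are arbitrary example values. It uses inet_ntop() for the address and ntohs() for the port.

```c
#include <arpa/inet.h>   /* inet_ntop, inet_pton, htons, ntohs */
#include <assert.h>
#include <netinet/in.h>  /* struct sockaddr_in, INET_ADDRSTRLEN */
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes "a.b.c.d:port" for sa into buf.
 * Returns buf on success, NULL if the address cannot be formatted. */
const char *format_peer(const struct sockaddr_in *sa, char *buf, size_t len)
{
    char ip[INET_ADDRSTRLEN];

    assert(sa != NULL && buf != NULL);
    if (inet_ntop(AF_INET, &sa->sin_addr, ip, sizeof ip) == NULL)
        return NULL;

    /* ntohs() turns the stored network-order port back into a host integer. */
    snprintf(buf, len, "%s:%u", ip, (unsigned)ntohs(sa->sin_port));
    return buf;
}

/* Example use with made-up values. */
void demo_format_peer(void)
{
    struct sockaddr_in sa;
    char buf[64];

    memset(&sa, 0, sizeof sa);
    sa.sin_family = AF_INET;
    sa.sin_port = htons(5000);
    if (inet_pton(AF_INET, "192.168.1.10", &sa.sin_addr) == 1 &&
        format_peer(&sa, buf, sizeof buf) != NULL)
        printf("peer is %s\n", buf);
}
```

The resulting string can then be passed to the error log with a plain %s, so the log shows the address and port that were actually dialed.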

HTH
--
Lew Pitcher
Master Codewright & JOAT-in-training | Registered Linux User #112576
Me: http://pitcher.digitalfreehold.ca/ | Just Linux: http://justlinux.ca/
---------- Slackware - Because I know what I'm doing. ------


From: Dmitry Bond. on
On Jun 3, 9:35 pm, Lew Pitcher <lpitc...(a)teksavvy.com> wrote:
> On June 3, 2010 13:36, in comp.os.linux.misc, dima_...(a)ukr.net wrote:
>
[...]
> What you want is
>     out_sock.sin_port = htons(port);
> assuming that port is in host byte order.
>

YES!!! :-))))
Thank you very much!
Exactly htons() did solve the problem!

But I have also rebuilt and tested it in a Windows environment and ran
into a strange thing: the same code (but with htons()) DOES NOT WORK on
Windows?! :-\
Of course, I have added #ifdef _WIN32 ... #else ... #endif, and now the
same code works fine on both Linux and Windows, but it looks strange.
Perhaps Windows sockets are a bit "different" from sockets on other
platforms...
From: Lew Pitcher on
On June 4, 2010 12:49, in comp.os.linux.misc, dima_ben(a)ukr.net wrote:

> On Jun 3, 9:35 pm, Lew Pitcher <lpitc...(a)teksavvy.com> wrote:
>> On June 3, 2010 13:36, in comp.os.linux.misc, dima_...(a)ukr.net wrote:
>>
> [...]
>> What you want is
>> out_sock.sin_port = htons(port);
>> assuming that port is in host byte order.
>>
>
> YES!!! :-))))
> Thank you very much!
> Exactly htons() did solve the problem!
>
> But also I have rebuilt and tested it a Windows environment and faced
> a strange thing - the same code (but with htons()) DOES NOT WORK on
> Windows?! :-\

Somehow, that doesn't surprise me.

OTOH, many of the MSDN "how to write network code" documents that I looked
at (prior to answering) include the htons() call as part of their example
code. See
http://msdn.microsoft.com/en-us/library/ms737625(VS.85).aspx
or
http://msdn.microsoft.com/en-us/library/ms738557(VS.85).aspx
or
http://msdn.microsoft.com/en-us/library/3thek09d(VS.80).aspx
for example.
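
What those examples have in common can be sketched roughly as follows. This is not taken from the MSDN pages; fill_peer_address is a hypothetical helper, and it assumes that port arrives as a plain host-order integer. Only the header includes differ per platform; the htons() call itself is unconditional.

```c
#ifdef _WIN32
#include <winsock2.h>    /* inet_addr, htons, struct sockaddr_in on Windows */
#else
#include <arpa/inet.h>   /* inet_addr, htons */
#include <netinet/in.h>  /* struct sockaddr_in, INADDR_NONE */
#endif
#include <assert.h>
#include <string.h>

/* Hypothetical helper: fills *out for a TCP connect() to addr:port.
 * Assumes port is in host byte order (for example 5000).
 * Returns 0 on success, -1 if addr is not a dotted IPv4 address
 * or port is outside 1..65535. */
int fill_peer_address(const char *addr, int port, struct sockaddr_in *out)
{
    assert(addr != NULL && out != NULL);

    if (port < 1 || port > 65535)
        return -1;

    memset(out, 0, sizeof *out);
    out->sin_family = AF_INET;

    /* inet_addr() already returns the address in network byte order. */
    out->sin_addr.s_addr = inet_addr(addr);
    if (out->sin_addr.s_addr == INADDR_NONE)
        return -1;

    /* Same conversion on every platform, no #ifdef around it. */
    out->sin_port = htons((unsigned short)port);
    return 0;
}
```

If a build only works without the htons(), one thing worth checking (an assumption, not something established in this thread) is whether the port value was already converted to network byte order somewhere before it reached this point.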

> Of course I have added #ifdef _WIN32 ... #else ... #endif and now the
> same code works fine on Linux and on Windows but it looks strange.
> Perhaps Windows sockets are a bit "different" than sockets on other
> platforms...

Of course they are.

Linux uses the sockets interface and TCP/IP networking stack derived
from and based on the design of BSD Unix, and standardized by the Open
Group as part of the POSIX and Single Unix Specification standards. Linux
was designed, more or less, from the start to do sockets-based TCP/IP
networking.

OTOH, Microsoft Windows uses a sockets interface originally tacked on by a
third party ("Trumpet WinSock"), with limitations from that "afterthought"
addition and features specifically designed for Microsoft Windows. While
the original MSWindows (native, not third-party) TCP/IP stack was derived
from BSD, current Windows systems use a native,
Microsoft-written TCP/IP stack, which "suffers" from the quirks
and "enhancements" that Microsoft added.

--
Lew Pitcher
Master Codewright & JOAT-in-training | Registered Linux User #112576
Me: http://pitcher.digitalfreehold.ca/ | Just Linux: http://justlinux.ca/
---------- Slackware - Because I know what I'm doing. ------


From: Dmitry Bond. on
On Jun 3, 20:35, Lew Pitcher <lpitc...(a)teksavvy.com> wrote:
> On June 3, 2010 13:36, in comp.os.linux.misc, dima_...(a)ukr.net wrote:
>

Thank you!
I don't know why, but it seems my first reply to your answer was lost, so
I am writing it once again.

I found that in my case it did not work because of the missing htons() call
for the port number (as you mentioned):

out_sock.sin_port = htons(port);

But code with htons() does not work on Windows!
So, I have to use

#ifdef _WIN32
out_sock.sin_port = port;
#else
out_sock.sin_port = htons(port);
#endif

to keep it working on all platforms.