Topic : Socket Programming
Author : Reg Quinton


An Introduction to Socket Programming
(by) Reg Quinton <reggers@julian.uwo.ca>
$Id: socket.html,v 1.8 1997/05/02 20:17:16 reggers Exp $  





Contents:
-Introduction
-BEWARE
-Existing Services
-Netstat Observations
-Host names and IP numbers
-Programming Calls
-Services and Ports
-Programming Calls
-Socket Addressing
-File Descriptors and Sockets
-File Descriptors
-Sockets
-Client Connect
-Client Communication
-Stdio Buffers
-Server Applications
-Server Bind
-Listen and Accept
-Inetd Services
-Inetd Comments
-Whois Daemon
-Running the Daemon
-The Code
-Connecting to the Server
-Whois Client
-Perl Socket Programming
-Final Comments
-Note Well
-Suggested Reading
-Author




Introduction:
These course notes are directed at Unix application programmers who want to develop client/server applications in the TCP/IP domain (with some hints for those who want to write UDP/IP applications). Since the Berkeley socket interface has become something of a standard, these notes will also apply to programmers on other platforms.

Fundamental concepts are covered including network addressing, well known services, sockets and ports. Sample applications are examined with a view to developing similar applications that serve other contexts. Our goals are

-to develop a function, tcpopen(server,service), to connect to a service.
-to develop a server that we can connect to.

This course requires an understanding of the C programming language and an appreciation of the programming environment (i.e. compilers, loaders, libraries, Makefiles and the RCS revision control system). If you want to know about socket programming with perl(1), see the Perl Socket Programming section, but you should read everything before it first.

Our example is the UWO/ITS whois(1) service -- client and server sources available in:

Network Services: http://www.uwo.ca/its/network
Look for the whois(1) client and the whoisd(8) server. You'll find extensive documentation on the UWO/ITS Whois/CSO server -- that's the whoisd(8) server. It also includes some Perl clients which access the server to provide a gateway service (for the Finding People Web page and for CSO/PH clients). The Unix whois(1) client will be pretty obvious after you've read these notes.


BEWARE:
If C code scares you, then you'll get some concepts but you might be in the wrong course. You need to be a programmer to write programs (of course). This isn't an Introduction to C (or Perl)!




Existing Services:
Before starting, let's look at existing services. On a Unix machine there are usually lots of TCP/IP and UDP/IP services installed and running:

[1:17pm julian] netstat -a
Active Internet connections (including servers)
Proto R-Q S-Q  Local Address Foreign Address    (state)
tcp     0   0  julian.2717   vnet.ibm.com.smtp  ESTABLISHED
tcp     0   0  julian.smtp   uacsc2.alban.55049 TIME_WAIT
tcp     0  13  julian.nntp   watserv1.wat.3507  ESTABLISHED
tcp     0   0  julian.nntp   gleep.csd.uw.3413  ESTABLISHED
tcp     0   0  julian.telnet uwonet-serve.55316 ESTABLISHED
tcp     0   0  julian.login  no8sun.csd.u.1023  ESTABLISHED
tcp     0   0  julian.2634   Xstn15.gaul..6000  ESTABLISHED
          etc...
tcp     0   0  *.printer     *.*                LISTEN
tcp     0   0  *.smtp        *.*                LISTEN
tcp     0   0  *.waisj       *.*                LISTEN
tcp     0   0  *.account     *.*                LISTEN
tcp     0   0  *.whois       *.*                LISTEN
tcp     0   0  *.nntp        *.*                LISTEN
          etc...
udp     0   0  *.ntp         *.*
udp     0   0  *.syslog      *.*
udp     0   0  *.xdmcp       *.*


Netstat Observations:
Inter Process Communication (or IPC) is between host.port pairs (or host.service if you like). A pair of processes uses each connection -- a client application on one end and a server application on the other.

Note the two protocols on IP -- TCP (Transmission Control Protocol) and UDP (User Datagram Protocol). There's a third protocol, ICMP (Internet Control Message Protocol), which we'll not look at -- it's what makes IP work in the first place!

We'll be looking in more detail at TCP services and will not look at UDP -- but see the sample Access Control List client/server pair, which uses UDP services. You'll find that in:


Access Control Lists: http://www.uwo.ca/its/network/security/acl
TCP services are connection-oriented (like a stream, a pipe or a tty-like connection) while UDP services are connectionless (more like telegrams or letters).
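
In code, the stream/datagram distinction shows up as the socket type passed to socket(2). The following is only a minimal sketch: the helper names open_inet_socket and inet_socket_type are hypothetical and are not part of the course sources.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>

/*
 * Hypothetical helper (not from the course sources): create an
 * Internet-domain socket. A nonzero `stream` asks for a TCP
 * (connection-oriented) socket, zero asks for a UDP (datagram) one.
 * Returns the descriptor, or -1 on error (see errno).
 */
int open_inet_socket(int stream)
{
    return socket(AF_INET, stream ? SOCK_STREAM : SOCK_DGRAM, 0);
}

/*
 * Hypothetical helper: report the type of socket fd as recorded by
 * the kernel (SOCK_STREAM or SOCK_DGRAM), or -1 on error.
 */
int inet_socket_type(int fd)
{
    int type = 0;
    socklen_t len = sizeof type;

    if (getsockopt(fd, SOL_SOCKET, SO_TYPE, &type, &len) < 0)
        return -1;
    return type;
}
```

Nothing is sent on the network by these calls; the socket type only decides which protocol later calls will use.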

We recognize many of the services -- SMTP (Simple Mail Transfer Protocol, as used for E-mail), NNTP (Network News Transfer Protocol, as used by Usenet News), NTP (Network Time Protocol, as used by xntpd(8)) and SYSLOG (the BSD service implemented by syslogd(1M)).

The netstat(1M) display shows many TCP services as ESTABLISHED (there is a connection between client.port and server.port) and others in a LISTEN state (a server application is listening at a port for client connections). You'll often see connections in a CLOSE_WAIT or TIME_WAIT state -- they're waiting for the socket to be torn down.




Host names and IP numbers:
Hosts have names (eg. julian.uwo.ca) but IP addressing is by number (eg. [129.100.2.12]). In the old days name/number translations were tabled in /etc/hosts.

[2:38pm julian] page /etc/hosts
# /etc/hosts: constructed out of private data and DNS. Some machines
# need to know some things at boot time. Otherwise, rely on DNS.
#
127.0.0.1       localhost
129.100.2.12    julian.uwo.ca
129.100.2.26    backus.ccs.uwo.ca loghost.its.uwo.ca
129.100.2.33    filehost.ccs.uwo.ca
129.100.2.14    panther.uwo.ca
          etc...

These days name to number translations are implemented by the Domain Name Service (or DNS) -- see named(8) and resolv.conf(4).

[2:43pm julian] page /etc/resolv.conf
# $Author: reggers $
# $Date: 1997/05/02 20:17:16 $
# $Id: socket.html,v 1.8 1997/05/02 20:17:16 reggers Exp $
# $Source: /usr/src/usr.local/doc/courses/socket/RCS/socket.html,v $
# $Locker:  $
#
# The default /etc/resolv.conf for the ITS solaris systems.
#
nameserver 129.100.2.12
nameserver 129.100.2.51
nameserver 129.100.10.252
domain its.uwo.ca
search ncsm.its.uwo.ca its.uwo.ca uwo.ca


Programming Calls:
Programmers don't scan /etc/hosts nor do they communicate with the DNS. The C library routines gethostbyname(3) (and gethostbyaddr(3) on the same page) each return a pointer to an object with the following structure:

struct     hostent {
   char   *h_name;        /* official name */
   char   **h_aliases;    /* alias list */
   int    h_addrtype;     /* address type */
   int    h_length;       /* address length */
   char   **h_addr_list;  /* address list */
};
#define h_addr h_addr_list[0]  /* backward compatibility */


The h_addr_list member is a NULL-terminated list of IP numbers (recall that a machine might have several interfaces, and each will have a number).

Good programmers would try to connect to each address listed in turn (eg. some versions of ftp(1) do that). Lazy programmers (like me) just use h_addr -- the first address listed. But see the acl(1) and acld(8) example noted earlier -- the client will try each server until it gets an answer or runs out of servers to ask.
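
Walking the whole address list, rather than stopping at h_addr, might look like the sketch below. It is illustrative only and assumes IPv4 addresses; the helper names count_addrs and print_addrs are hypothetical and do not come from the whois or acl sources.

```c
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <netdb.h>

/*
 * Hypothetical helper (not from the course sources): count the
 * addresses in a hostent by walking h_addr_list to its NULL terminator.
 */
int count_addrs(const struct hostent *hp)
{
    int n = 0;

    while (hp->h_addr_list[n] != NULL)
        n++;
    return n;
}

/*
 * Hypothetical helper: print every IPv4 address in a hostent in
 * dotted-quad form, one per line. Returns the number printed.
 */
int print_addrs(const struct hostent *hp, FILE *out)
{
    char text[INET_ADDRSTRLEN];
    struct in_addr addr;
    int i, printed = 0;

    if (hp->h_addrtype != AF_INET || hp->h_length != (int)sizeof addr)
        return 0;               /* this sketch handles IPv4 only */

    for (i = 0; hp->h_addr_list[i] != NULL; i++) {
        /* entries are raw bytes in network order; copy, don't cast */
        memcpy(&addr, hp->h_addr_list[i], sizeof addr);
        if (inet_ntop(AF_INET, &addr, text, sizeof text) != NULL) {
            fprintf(out, "%s\n", text);
            printed++;
        }
    }
    return printed;
}
```

A caller would pass in the result of a successful gethostbyname(3) call, for example print_addrs(hp, stdout) once hp has been checked against NULL. A client that tries each address in turn would use the same loop, attempting a connection where this sketch prints.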

Client applications connect to a host.port (cf. netstat output) for a service provided by the application found at that address.

Proto R-Q S-Q  Local Address  Foreign Address    (state)
tcp     0   0  julian.2717    vnet.ibm.com.smtp  ESTABLISHED
tcp     0  13  julian.nntp    watserv1.wat.3507  ESTABLISHED


The connection is usually prefaced by translating a host name into an IP number (but if you knew the IP number you could carefully skip that step).

int     tcpopen(host,service)
char    *service, *host;
{
    struct  hostent         *hp;
          /* etc... */
    if ((hp = gethostbyname(host)) == NULL) {
        /* error... */
    }
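
When the host is already given as a dotted-quad number, the name lookup can be skipped. The sketch below shows one way to do that, assuming IPv4 only; resolve_ipv4 is a hypothetical helper, not part of tcpopen or the course sources.

```c
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <netdb.h>

/*
 * Hypothetical helper (not from the course sources): fill *out with the
 * IPv4 address for `host`. A dotted-quad string such as "129.100.2.12"
 * is converted directly; anything else is looked up by name.
 * Returns 0 on success, -1 on failure.
 */
int resolve_ipv4(const char *host, struct in_addr *out)
{
    struct hostent *hp;

    /* inet_pton() stores the address in network byte order */
    if (inet_pton(AF_INET, host, out) == 1)
        return 0;

    hp = gethostbyname(host);
    if (hp == NULL || hp->h_addrtype != AF_INET
        || hp->h_length != (int)sizeof *out)
        return -1;

    /* h_addr_list[0] is already in network byte order; copy the bytes */
    memcpy(out, hp->h_addr_list[0], sizeof *out);
    return 0;
}
```

Either way the result is held in network byte order, so the bytes in memory read 129, 100, 2, 12 for "129.100.2.12" regardless of the host's own byte order. Use ntohl(3) before treating the s_addr member as an ordinary integer.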


I say "carefully" because the IP address is a structure of 4 octets. Watch out for byte ordering. An unsigned long isn't the
