How to handle the Linux socket revents POLLERR, POLLHUP and POLLNVAL?

Posted by: admin January 30, 2018

Questions:

I’m wondering what should be done when poll sets these bits. Close the socket, ignore them, or something else?

Answers:

A POLLHUP means the socket is no longer connected. In TCP, this means FIN has been received and sent.

A POLLERR means the socket got an asynchronous error. In TCP, this typically means a RST has been received or sent. If the file descriptor is not a socket, POLLERR might mean the device does not support polling.

For both of the conditions above, the socket file descriptor is still open, and has not yet been closed (but shutdown() may have already been called). A close() on the file descriptor will release resources that are still being reserved on behalf of the socket. In theory, it should be possible to reuse the socket immediately (e.g., with another connect() call).

A POLLNVAL means the socket file descriptor is not open. It would be an error to close() it.
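The three cases above can be turned into a small decision function. This is only a sketch: the enum and the name classify_revents are hypothetical, introduced here for illustration, and the "drain first" case assumes you want any data that arrived together with the hangup.

```c
#include <poll.h>

/* Hypothetical action codes, named here only for illustration. */
enum fd_action {
    FD_KEEP,        /* nothing terminal happened; keep polling */
    FD_DRAIN_CLOSE, /* read whatever is still buffered, then close() */
    FD_CLOSE,       /* close() to release the socket's resources */
    FD_FORGET       /* descriptor is not open: drop it, do not close() */
};

enum fd_action classify_revents(short revents)
{
    if (revents & POLLNVAL)
        return FD_FORGET;   /* closing an fd that is not open is an error */
    if (revents & POLLERR)
        return FD_CLOSE;    /* asynchronous error, e.g. a TCP RST */
    if (revents & POLLHUP)
        return (revents & POLLIN) ? FD_DRAIN_CLOSE : FD_CLOSE;
    return FD_KEEP;
}
```

A caller would run this on pfd.revents after each poll() and stop polling the descriptor for anything other than FD_KEEP.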

Answers:

It depends on the exact nature of the error. Use getsockopt() to find out what the problem is:

int error = 0;
socklen_t errlen = sizeof(error);
getsockopt(fd, SOL_SOCKET, SO_ERROR, (void *)&error, &errlen);

Values: http://www.xinotes.net/notes/note/1793/

The easiest way is to assume that the socket is no longer usable in any case and close it.
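If you do want to look at the error before closing, the getsockopt() call can be wrapped so that a failure of the query itself is not mistaken for "no error". A minimal sketch; the helper name socket_pending_error is made up for this example:

```c
#include <errno.h>
#include <sys/socket.h>

/* Hypothetical helper: returns 0 if no error is pending on the socket,
 * otherwise an errno-style code. Reading SO_ERROR also clears it. */
int socket_pending_error(int fd)
{
    int error = 0;
    socklen_t errlen = sizeof(error);

    if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &error, &errlen) == -1)
        return errno;   /* the query itself failed, e.g. EBADF or ENOTSOCK */
    return error;
}
```

The result can be passed to strerror() for logging, after which the socket is closed as suggested above.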

Answers:

POLLNVAL means that the file descriptor value is invalid. It usually indicates an error in your program, but you can rely on poll returning POLLNVAL if you’ve closed a file descriptor and you haven’t opened any file since then that might have reused the descriptor.

POLLERR is similar to error events from select. It indicates that a read or write call would return an error condition (e.g. I/O error). This does not include out-of-band data which select signals via its errorfds mask but poll signals via POLLPRI.

POLLHUP basically means that what’s at the other end of the connection has closed its end of the connection. POSIX describes it as

The device has been disconnected. This event and POLLOUT are mutually-exclusive; a stream can never be writable if a hangup has occurred.

This is clear enough for a terminal: the terminal has gone away (same event that generates a SIGHUP: the modem session has been terminated, the terminal emulator window has been closed, etc.). POLLHUP is never sent for a regular file. For pipes and sockets, it depends on the operating system. Linux sets POLLHUP when the program on the writing end of a pipe closes the pipe, and sets POLLIN|POLLHUP when the other end of a socket has closed the socket, but only POLLIN when the other end has merely called shutdown(). Recent *BSD set POLLIN|POLLHUP when the writing end of a pipe closes the pipe, and the behavior for sockets is more variable.
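The Linux socket behavior described above can be observed with a connected AF_UNIX socket pair. This is a sketch under assumptions: the helper name revents_after_peer is hypothetical, the result is Linux-specific, and a TCP socket may report something different because POLLHUP there depends on both directions being shut down.

```c
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: create a connected AF_UNIX pair, end the peer side
 * either with close() or with shutdown(SHUT_WR), and return the revents
 * that poll() reports for the other side. Returns -1 on failure. */
int revents_after_peer(int peer_closes)
{
    int sv[2];
    struct pollfd pfd;
    int ret;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return -1;
    if (peer_closes)
        close(sv[1]);               /* peer closes the socket entirely */
    else
        shutdown(sv[1], SHUT_WR);   /* peer only stops writing */

    pfd.fd = sv[0];
    pfd.events = POLLIN;
    pfd.revents = 0;
    ret = poll(&pfd, 1, 0);         /* zero timeout: just sample the state */

    close(sv[0]);
    if (!peer_closes)
        close(sv[1]);
    return ret == -1 ? -1 : pfd.revents;
}
```

On Linux, revents_after_peer(1) should have both POLLIN and POLLHUP set, while revents_after_peer(0) should have POLLIN without POLLHUP.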

Answers:

Minimal FIFO example

Once you understand when those conditions happen, it should be easy to know what to do with them.

#define _XOPEN_SOURCE 700
#include <fcntl.h> /* open, O_RDONLY, O_NONBLOCK */
#include <poll.h> /* poll */
#include <stdio.h> /* perror, printf, puts */
#include <stdlib.h> /* EXIT_FAILURE */
#include <unistd.h> /* read, close */

int main(void) {
    char buf[1024];
    int fd, n;
    short revents;
    struct pollfd pfd;

    /* O_NONBLOCK lets open() return before any writer has opened the FIFO. */
    fd = open("poll0.tmp", O_RDONLY | O_NONBLOCK);
    if (fd == -1) {
        perror("open");
        return EXIT_FAILURE;
    }
    pfd.fd = fd;
    pfd.events = POLLIN;
    while (1) {
        puts("loop");
        if (poll(&pfd, 1, -1) == -1) {
            perror("poll");
            return EXIT_FAILURE;
        }
        revents = pfd.revents;
        if (revents & POLLIN) {
            n = read(pfd.fd, buf, sizeof(buf));
            if (n >= 0) /* buf is not NUL-terminated, so print exactly n bytes */
                printf("POLLIN n=%d buf=%.*s\n", n, n, buf);
        }
        if (revents & POLLHUP) {
            printf("POLLHUP\n");
            close(pfd.fd);
            /* poll() ignores entries whose fd is negative. */
            pfd.fd *= -1;
        }
        if (revents & POLLNVAL) {
            printf("POLLNVAL\n");
        }
        if (revents & POLLERR) {
            printf("POLLERR\n");
        }
    }
}

Compile with:

gcc -o poll.out -std=c99 poll.c

Usage:

mknod -m 666 poll0.tmp p
./poll.out

On another shell:

printf a >poll0.tmp

POLLHUP

If you don’t modify the source: ./poll.out outputs:

loop
POLLIN n=1 buf=a
loop
POLLHUP
loop

So:

  • POLLIN happens when input becomes available
  • POLLHUP happens when the writer closes the FIFO, which here is when the printf command finishes
  • close(pfd.fd); and pfd.fd *= -1; clean things up, and we stop receiving POLLHUP
  • poll then blocks forever, because it ignores an entry whose fd is negative and the timeout is -1

This is the normal operation.

You could now reopen the FIFO to wait for the next writer, or exit the loop if you are done.

POLLNVAL

If you comment out pfd.fd *= -1;: ./poll.out prints:

loop
POLLIN n=1 buf=a
loop
POLLHUP
loop
POLLNVAL
loop
POLLNVAL
...

and loops forever.

So:

  • POLLIN and POLLHUP and close happened as before
  • since we didn’t set pfd.fd to a negative number, poll keeps trying to use the fd that we closed
  • this keeps returning POLLNVAL forever

So POLLNVAL is something that should never happen in a correct program: seeing it indicates a bug in your code.
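The difference between a closed descriptor and a negative one can be checked in isolation. A small sketch, with hypothetical helper names poll_once and poll_closed_fd, using a pipe instead of the FIFO so that it needs no setup:

```c
#include <poll.h>
#include <unistd.h>

/* Hypothetical helper: poll a single descriptor with a zero timeout and
 * return its revents, or -1 if poll() itself failed. */
int poll_once(int fd)
{
    struct pollfd pfd;

    pfd.fd = fd;
    pfd.events = POLLIN;
    pfd.revents = 0;
    if (poll(&pfd, 1, 0) == -1)
        return -1;
    return pfd.revents;
}

/* Open a pipe, close both ends, and poll the stale read descriptor. */
int poll_closed_fd(void)
{
    int fds[2];

    if (pipe(fds) == -1)
        return -1;
    close(fds[0]);
    close(fds[1]);
    return poll_once(fds[0]);   /* fds[0] is no longer open */
}
```

Here poll_closed_fd() reports POLLNVAL, while poll_once(-1) reports no events at all, which is why negating pfd.fd after close() silences the loop.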

POLLERR

I don’t know how to generate a POLLERR with FIFOs. Let me know if there is a way. But it should be possible with the file_operations of a device driver.
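One case that the Linux poll(2) manual page does document is the write end of a pipe whose read end has been closed: POLLERR is set for it. The sketch below uses an anonymous pipe rather than the FIFO from the example, the helper name is hypothetical, and the behavior should be treated as Linux-specific.

```c
#include <poll.h>
#include <unistd.h>

/* Hypothetical helper: close the read end of a fresh pipe, then return the
 * revents that poll() reports for the write end, or -1 on failure. */
int poll_write_end_without_reader(void)
{
    int fds[2];
    struct pollfd pfd;
    int ret;

    if (pipe(fds) == -1)
        return -1;
    close(fds[0]);              /* the reader goes away */

    pfd.fd = fds[1];
    pfd.events = POLLOUT;       /* POLLERR is reported even if not requested */
    pfd.revents = 0;
    ret = poll(&pfd, 1, 0);

    close(fds[1]);
    return ret == -1 ? -1 : pfd.revents;
}
```

On Linux the returned value should include POLLERR; a write() to that descriptor at this point would fail with EPIPE.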

Tested in Ubuntu 14.04.