I think it’s related to the parent process creating a new subprocess that does not have a tty. Can anyone explain the details under the hood, i.e. the related working model of bash, process creation, etc.?
It may be a very broad topic, so pointers to posts are also much appreciated. I’ve Googled for a while, but all the results are about very specific cases and none covers the story behind the scenes. To provide more context, below is the shell script that produces ‘bash: no job control in this shell’.
#! /bin/bash
while [ 1 ]; do
    st=$(netstat -an |grep 7070 |grep LISTEN -o | uniq)
    if [ -z $st ]; then
        echo "need to start proxy @$(date)"
        bash -i -c "ssh -D 7070 -N [email protected] > /dev/null"
    else
        echo "proxy OK @$(date)"
    fi
    sleep 3
done
bash -i -c "ssh -D 7070 -N [email protected] > /dev/null"
is where “bash: no job control in this shell” comes from.
Job control is a collection of features in the shell and the tty driver which allow the user to manage multiple jobs from a single interactive shell.
A job is a single command or a pipeline. If you run
ls, that’s a job. If you run
ls|more, that’s still just one job. If the command you run starts subprocesses of its own, then they will also belong to the same job unless they are intentionally detached.
Without job control, you have the ability to put a job in the background by adding
& to the command line. And that’s about all the control you have.
With job control, you can additionally:
- Suspend a running foreground job with Ctrl+Z
- Resume a suspended job in the foreground with fg
- Resume a suspended job in the background with bg
- Bring a running background job into the foreground with fg
The shell maintains a list of jobs which you can see by running the
jobs command. Each one is assigned a job number (distinct from the PIDs of the process(es) that make up the job). You can use the job number, prefixed with
%, as an argument to fg or
bg to select a job to foreground or background. The %jobnumber notation is also acceptable to the shell’s builtin
kill command. This can be convenient because the job numbers are assigned starting from 1, so they’re shorter than PIDs.
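Here is a minimal sketch of job numbers versus PIDs. It is written to run as a plain script, where fg and bg are not usable, so it only exercises the job list and kill with a %jobnumber argument. The two sleep commands are just stand-ins for real work.

```shell
#!/bin/bash
# Start two background jobs; bash numbers them 1 and 2.
sleep 30 &
pid1=$!          # PID of job 1
sleep 30 &
pid2=$!          # PID of job 2

# Show the job list, including the PID behind each job number.
jobs -l

# Signal the jobs by job number instead of by PID.
kill %1
kill %2

# Wait for both jobs to terminate and be reaped.
wait
```

After the wait, neither PID refers to a live process any more.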
There are also shortcuts
%+ for the most recently foregrounded job and
%- for the previously foregrounded job, so you can switch back and forth rapidly between two jobs with Ctrl+Z followed by
fg %- (suspend the current one, resume the other one) without having to remember the numbers. Or you can use the beginning of the command itself. If you have suspended an
ffmpeg command, resuming it is as easy as
fg %ff (assuming no other active jobs start with “ff”). And as one last shortcut, you don’t have to type the
fg. Just entering
%- as a command foregrounds the previous job.
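The "beginning of the command" form can be tried without an interactive session. The sketch below assumes two throwaway background jobs (sleep and tail, chosen only because their names start differently) and selects each one by prefix. It uses kill rather than fg because fg needs a terminal and job control.

```shell
#!/bin/bash
# Two background jobs whose command names start with different letters.
sleep 30 &
sleep_pid=$!
tail -f /dev/null &
tail_pid=$!

# Select each job by the beginning of its command line.
kill %sl     # the job whose command starts with "sl" (sleep)
kill %ta     # the job whose command starts with "ta" (tail)

# Reap both jobs.
wait
```

If two active jobs started with the same prefix, bash would refuse the ambiguous job spec, which is the caveat mentioned above for %ff.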
“But why do we need this?” I can hear you asking. “I can just start another shell if I want to run another command.” True, there are many ways of multitasking. On a normal day I have login shells running on tty1 through tty10 (yes there are more than 6, you just have to activate them), one of which will be running a screen session with 4 screens in it, another might have an ssh running on it in which there is another screen session running on the remote machine, plus my X session with 3 or 4 xterms. And I still use job control.
If I’m in the middle of
aptitude or any other interactive thing, and I need to run a couple of other quick commands to decide how to proceed, Ctrl+Z, run the commands, and
fg is natural and quick. (In lots of cases an interactive program has a
! keybinding to run an external command for you; I don’t think that’s as good because you don’t get the benefit of your shell’s history, command line editor, and completion system.) I find it sad whenever I see someone launch a secondary xterm/screen/whatever to run one command, look at it for two seconds, and then exit.
Now about this script of yours. In general it does not appear to be competently written. The line in question:
bash -i -c "ssh -D 7070 -N [email protected] > /dev/null"
is confusing. I can’t figure out why the ssh command is being passed down to a separate shell instead of just being executed straight from the main script, let alone why someone added
-i to it. The
-i option tells the shell to run interactively, which activates job control (among other things). But it isn’t actually being used interactively. Whatever the purpose was behind the separate shell and the
-i, the warning about job control was a side effect. I’m guessing it was a hack to get around some undesirable feature of ssh. That’s the kind of thing that when you do it, you should comment it.
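For comparison, here is a rough sketch of the same loop body with ssh run straight from the script, with no nested shell and no -i. I don't know why the nested shell was there, so this may lose whatever the hack was for. PROXY_HOST is a placeholder (not the address from the question), and has_listener and check_once are names I made up for illustration.

```shell
#!/bin/bash
# Sketch only: PROXY_HOST is a placeholder, not the address from the question.
PROXY_PORT=7070
PROXY_HOST="user@example.invalid"

# Reads netstat-style output on stdin; succeeds if a line mentions the
# given port and the LISTEN state (same idea as the original grep chain).
has_listener() {
    grep "$1" | grep -q LISTEN
}

# One iteration of the watchdog loop.
check_once() {
    if netstat -an 2>/dev/null | has_listener "$PROXY_PORT"; then
        echo "proxy OK @$(date)"
    else
        echo "need to start proxy @$(date)"
        # ssh is run directly by the script: no interactive shell is
        # started, so bash never tries to set up job control here.
        ssh -D "$PROXY_PORT" -N "$PROXY_HOST" > /dev/null
    fi
}

# Main loop, left commented out so the sketch can be read or sourced safely:
# while true; do check_once; sleep 3; done
```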
One possible cause is that the shell does not have access to the tty.
Under the hood:
- bash checks whether the session is interactive; if not, there is no job control.
- if forced_interactive is set, the check that stderr is attached to a tty is skipped, and bash checks again whether it can open /dev/tty for read-write access.
- then it checks whether the new line discipline is in use; if not, job control is disabled too.
- If (and only if) we just set our process group to our pid, thereby becoming a process group leader, and the terminal is not in the same process group as our (new) process group, then set the terminal’s process group to our (new) process group. If that fails, set our process group back to what it was originally (so we can still read from the terminal) and turn off job control.
- if any of these checks fails and job control ends up disabled, you see the message.
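Some of these conditions can be approximated from the shell side. The sketch below is not bash's own C code, just a rough analogue. It reports whether stderr is a tty, whether /dev/tty can be opened read-write, and whether the monitor (job control) flag is currently set. describe_tty is a name made up for this example.

```shell
#!/bin/bash
# Rough shell-level analogue of the checks listed above (not bash's C code).
describe_tty() {
    # Is stderr (file descriptor 2) attached to a terminal?
    if [ -t 2 ]; then
        echo "stderr is a tty"
    else
        echo "stderr is not a tty"
    fi

    # Can the controlling terminal be opened for reading and writing?
    # The subshell keeps a failed redirection from affecting this shell.
    if ( exec 3<>/dev/tty ) 2>/dev/null; then
        echo "/dev/tty can be opened read-write"
    else
        echo "/dev/tty cannot be opened"
    fi

    # The "m" flag in $- means monitor mode, i.e. job control, is on.
    case $- in
        *m*) echo "job control is on" ;;
        *)   echo "job control is off" ;;
    esac
}

describe_tty
```

Running it from a terminal and again from cron or a daemon should show the difference that leads to the warning.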
I partially quoted the comments from bash source code.
As per additional request of the question author:
You can find bash itself here: http://tiswww.case.edu/php/chet/bash/bashtop.html
If you can read C code, get the source tarball; inside it you will find
jobs.c, which will tell you more about the “under the hood” stuff. 🙂
I ran into a problem on my own embedded system and I got rid of the “no job control” error by running the getty process with “setsid”, which according to its manpage starts a process with a new session id.
You may need to enable job control:
#! /bin/bash
set -m
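To see what set -m changes, you can look at the m flag in $- before and after. This is a small sketch that assumes the script is started non-interactively, where monitor mode is normally off. The variable names before and after are only for illustration.

```shell
#!/bin/bash
# Record whether the monitor (job control) flag is set before the change.
case $- in
    *m*) before=on ;;
    *)   before=off ;;
esac

# Turn on monitor mode; background jobs now get their own process groups.
set -m

# Record the flag again after enabling it.
case $- in
    *m*) after=on ;;
    *)   after=off ;;
esac

echo "monitor mode before: $before, after: $after"
```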