Zero RX and TX Packets
My new server had been running for about 10 days. Running Slackware 14.1 32-bit. No connectivity problems. Excellent iperf results. No problems streaming videos to my media player.
Then I was checking something (I now forget what) when I ran ifconfig at the server. All zeroes for RX and TX packets. Likewise for /proc/net/dev. Likewise for
The vnstat history indicated the defect had existed from the moment I went live with the server. I never had much reason to notice because of the great connectivity.
I use the ifconfig transfer stats in a script to decide when to power down at night. I have been using this script for several years, since I started using my office desktop for pseudo server tasks. The RX and TX stats indicate current connection transfers, such as client systems streaming music and videos, or downloading package updates through cron jobs. (I also use netstat -an to test client connections but that is unrelated to this topic.)
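For illustration only, here is a minimal sketch of that kind of idle test. It is not my actual script. It reads the byte counters from /proc/net/dev rather than parsing ifconfig output, and the function names, the 60 second interval, and the 100000 byte limit are all made-up assumptions.

```shell
#!/bin/sh
# Hypothetical helpers, not the script described in this post.

# Print "RX_BYTES TX_BYTES" for one interface from a /proc/net/dev style file.
# The optional second argument exists only so a sample file can be used.
iface_bytes() {
    iface="$1"
    file="${2:-/proc/net/dev}"
    # Lines look like "  eth0: 1234 10 0 ...". Turn the colon into a space so
    # the interface name is its own field even when no space follows it.
    # Field 2 is the RX byte count and field 10 is the TX byte count.
    sed 's/:/ /' "$file" | awk -v i="$iface" '$1 == i { print $2, $10 }'
}

# Succeed when fewer than LIMIT bytes moved during INTERVAL seconds.
# Returns 2 when the interface is not found.
is_idle() {
    iface="$1"; interval="${2:-60}"; limit="${3:-100000}"
    file="${4:-/proc/net/dev}"
    before=$(iface_bytes "$iface" "$file")
    sleep "$interval"
    after=$(iface_bytes "$iface" "$file")
    [ -n "$before" ] && [ -n "$after" ] || return 2
    rx1=${before% *}; tx1=${before#* }
    rx2=${after% *};  tx2=${after#* }
    [ $(( (rx2 - rx1) + (tx2 - tx1) )) -lt "$limit" ]
}
```

Something like `is_idle eth0 && echo "quiet"` would be the usage. Note that a driver reporting all zeroes always looks idle to a test like this, which is exactly why the missing counters mattered.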
The server's ASRock N68C-GS4 FX motherboard uses an Atheros QCA8171 Gigabit chipset, which uses the alx kernel module driver. Lesson number one: be more diligent when buying motherboards. Actually I had been diligent when I researched the board. I read some customer reports that the Atheros chip worked in Linux. I did not pause to think that the statement needed qualification as to which kernel versions. Or that quirks existed.
Some web surfing indicated a lack of traffic counter support with that module.
There were reports of wake-on-lan not working with some newer kernel versions. I was having no problems using wake-on-lan with the motherboard. All I needed was the network stats.
Slackware 14.1 was released with the 3.10.17 kernel. The latest 3.10 version is 3.10.94.
More surfing indicated that the alx driver network stats had been fixed, at least in the 3.14 kernel. Perhaps the related patches had been backported to the 3.10 series.
This is classic WTF material. Year of the Linux Desktop? Not likely with this kind of attitude.
I had not compiled a kernel in several years. I still had my script wrapper from a few years ago when I regularly compiled kernels on Slackware.
I interrupted my day to fix Yet Another Linux Related Bug.
I copied the Slackware 3.10.17 kernel config file to my kernel build directory. After a few nominal adjustments to my old script, and with a mild shake of my head in wonderment, I was on my way to compiling a new kernel.
The days of 15 minute kernel compiles are long gone. Compiling the 3.10.94 kernel took an hour on my 2.3 GHz dual core system.
While compiling I looked further into the problem. After learning the stats counter problem was fixed in the 3.14 series kernel, I downloaded a copy of the 3.14.58 sources and compared them to the 3.10.94 sources. The net stats counter support had not been backported into the 3.10.94 kernel. I was wasting my time compiling.
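A rough sketch of that kind of comparison follows. The helper name count_refs and the search word "stats" are assumptions for illustration. The alx sources live under drivers/net/ethernet/atheros/alx in the kernel tree.

```shell
#!/bin/sh
# Hypothetical helper: count the lines under a directory that mention a pattern.
count_refs() {
    dir="$1"; pattern="$2"
    # grep -r walks the tree; wc -l counts matching lines (0 when none match).
    grep -r -- "$pattern" "$dir" 2>/dev/null | wc -l | tr -d ' '
}

# Example usage, assuming both source trees are unpacked side by side:
#   alx=drivers/net/ethernet/atheros/alx
#   count_refs linux-3.10.94/$alx stats
#   count_refs linux-3.14.58/$alx stats
#   diff -ru linux-3.10.94/$alx linux-3.14.58/$alx | less
```

A large difference in the counts is only a hint. Reading the diff itself is what settles whether the support exists.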
I next thought I could find a patch to merge with the 3.10.17 sources. I did not find anything specific but I ran into something called the kernel backports. Seems this is what I needed.
I browsed the alx sources and found references to stat counting. Good.
The general approach is nice:
- Download the latest version of backports.
- cd to that directory.
As far as I can tell the default config is packaged such that nothing or very little is enabled. The trick is to enable only the needed modules and save the config file.
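The steps can be sketched roughly as follows. This is an assumption-laden outline and not a record of the exact commands I typed. The function name build_backports and the DRY_RUN switch are made up for illustration, and the sketch assumes the usual menuconfig, make, make install sequence.

```shell
#!/bin/sh
# Hypothetical wrapper around the backports build steps.
# With DRY_RUN set, each command is printed instead of executed.
build_backports() {
    src="$1"
    run=${DRY_RUN:+echo}
    (
        # Work in a subshell so the caller's directory is left alone.
        cd "$src" || exit 1
        # Enable only the needed driver in the menu, then save the config.
        $run make menuconfig &&
        # Build the selected modules.
        $run make &&
        # Install the modules; this step needs root.
        $run make install
    )
}
```

Running `DRY_RUN=1 build_backports /path/to/backports` first shows what would happen without touching anything.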
When I finished running make there were two kernel modules in the source tree:
make install installed the new modules to /lib/modules/`uname -r`/updates and runs
Next I held my breath:
rmmod alx && modprobe alx && /etc/rc.d/rc.inet1 restart
The one-liner was quick enough that my office system did not miss a beat and continued playing tunes from the server. I ran two scripts, one to test my ISP connection route and then another to sync files from the server. Finally I had ifconfig eth0 network stats.
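One extra sanity check, sketched here as an assumption and not something I did at the time, is to confirm that modprobe resolves alx to the copy under the updates directory. The helper name from_updates is made up. modinfo -n prints the file name of the module that would be loaded.

```shell
#!/bin/sh
# Hypothetical helper: succeed when a module path sits under an updates directory.
from_updates() {
    case "$1" in
        */updates/*) return 0 ;;
        *)           return 1 ;;
    esac
}

# Example usage:
#   from_updates "$(modinfo -n alx)" && echo "backports build will be loaded"
```

This reports which file modprobe would pick, which is not proof of what is loaded right now, so reloading the module afterward still matters.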
Yay. A short-lived yay. A cynical and sour yay.
I lost wake-on-lan support, which was working beautifully before the updated driver. Reboots and shutdowns did not fix the bug. ethtool verified the loss. There was no wake-on-lan support at all.
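For anyone wanting to script the same check, here is a small sketch that pulls the two wake-on-lan lines out of ethtool output. The function name wol_modes and the output format are assumptions for illustration. The "Supports Wake-on" and "Wake-on" labels are what ethtool prints when the driver reports them.

```shell
#!/bin/sh
# Hypothetical helper: read `ethtool IFACE` output on stdin and print the
# supported and active wake-on-lan flags. Both values are empty when the
# driver reports nothing.
wol_modes() {
    awk -F': *' '
        /Supports Wake-on:/ { supported = $2 }
        /^[ \t]*Wake-on:/   { active = $2 }
        END { print "supported=" supported, "active=" active }
    '
}

# Example usage:
#   ethtool eth0 | wol_modes
```

An active value of g means the card wakes on a magic packet, and d means wake-on-lan is disabled.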
I found a reverse patch in this bug report. The bug report indicates the wake-on-lan support was ripped from the driver and has yet to be restored. I checked the alx driver in the backports sources and confirmed the lack of support. The reverse patch was my only hope.
I merged the patch and recompiled the backports sources.
I verified network stats with ifconfig and wake-on-lan support with ethtool. A suspend to RAM and a magic packet from my office computer verified wake-on-lan was again truly working.
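For the curious, a magic packet is nothing fancy: six bytes of 0xFF followed by the target MAC address repeated sixteen times. The sketch below only builds the payload as a hex string. The function name magic_packet_hex is made up, and sending the bytes is left to whatever tool is handy.

```shell
#!/bin/sh
# Hypothetical helper: print the magic packet payload for a MAC as hex text.
magic_packet_hex() {
    # Strip separators and normalize to lower case.
    mac=$(printf '%s' "$1" | tr -d ':-' | tr 'A-F' 'a-f')
    # Reject anything that is not exactly 12 hex digits.
    [ "${#mac}" -eq 12 ] || return 1
    case "$mac" in
        *[!0-9a-f]*) return 1 ;;
    esac
    # Six 0xFF bytes, then the MAC sixteen times: 102 bytes in total.
    out=ffffffffffff
    i=0
    while [ "$i" -lt 16 ]; do
        out="$out$mac"
        i=$((i + 1))
    done
    printf '%s\n' "$out"
}
```

In practice a ready-made tool such as wakeonlan or ether-wake does the sending. The sketch just shows what goes on the wire.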
That is how people using Linux waste, er, consume, er, spend an entire afternoon fixing bugs. Someone once defined insanity as repeating the same act and expecting different results. We Linux users are insane. We keep repeating the same act of acting as though this stuff just works.
Does the alx kernel maintainer receive a paycheck for this kind of unprofessional work?