I describe a problem that I was unable to fix. It was a hardware issue.
I have a Linux server running CentOS 7 with the kernel release shown below.
[root@kvm1 ~]# uname -r
3.10.0-862.14.4.el7.x86_64
Suddenly the Ethernet interface started to respond intermittently, so I checked the log. The following figure shows the error message.
This means that an interrupt was lost. It is probably a miscommunication between the hypervisor and the system during IRQ assignment. The IRQ number is used to find the address of the interrupt vector, and here the IRQ value that arrives is -1. On Intel hardware there are 16 interrupt lines with the legacy PIC or 24 with an I/O APIC. There are 256 possible software interrupt types, called interrupt vectors, and each one can serve more than one device. The first 32 are reserved for processor exceptions and the NMI. By IRQ we mean the hardware interrupt.
If I restart the server the problem appears to be fixed, but what happened?
The Linux system has a service called irqbalance that distributes hardware interrupts across the processors in a multicore or multiprocessor system. The NIC interrupts are handled by irqbalance too. The mechanism it drives is called smp_affinity: every IRQ has a CPU mask in /proc/irq/<N>/smp_affinity that says which CPUs may serve it. You can disable this service, isolate one CPU and pin the NIC interrupts to the isolated CPU.
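To make the pinning step a bit more concrete, here is a minimal Python sketch that builds the hexadecimal mask that smp_affinity expects from a list of CPU numbers. The function name cpus_to_affinity_mask is my own, and CPU 3 in the usage lines is only an illustrative choice, not the value I used on this server.

```python
def cpus_to_affinity_mask(cpus):
    """Return the hex CPU bitmask for a list of CPU numbers.

    Bit n of the mask stands for CPUn. The mask is printed in
    comma-separated 32-bit groups, most significant group first.
    """
    mask = 0
    for cpu in cpus:
        if cpu < 0:
            raise ValueError("CPU numbers must be non-negative")
        mask |= 1 << cpu
    groups = []
    while True:
        groups.append("%08x" % (mask & 0xFFFFFFFF))
        mask >>= 32
        if mask == 0:
            break
    return ",".join(reversed(groups))


if __name__ == "__main__":
    # Illustrative values only: a single CPU, then all 32 CPUs.
    print(cpus_to_affinity_mask([3]))        # 00000008
    print(cpus_to_affinity_mask(range(32)))  # ffffffff
```

The resulting string would then be written, as root, to /proc/irq/<N>/smp_affinity for the IRQ you want to pin. If irqbalance is still running it may rewrite the value afterwards.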
You can inspect the file /proc/interrupts.
In the rows of this table you will find the entries for your Ethernet NIC (the one you can see by issuing the 'ifconfig' command). My system has 32 CPUs and runs KVM with virtual machines. There is only one 10 Gigabit card, named enp2s0.
The image above is small, but the last column holds the Ethernet interface names and the columns in between are named CPU0 to CPU31. You can see the even IRQ distribution across the cores for the queues named enp2s0-TxRx-0 to enp2s0-TxRx-31. The first column holds the IRQ numbers used by the system; here they are 25 to 64 for the Ethernet card.
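Since the table is hard to read by eye, a small script can pull out just the rows of one interface. This is a rough sketch under the assumption that the file has the usual layout (a header row of CPU names, then one row per IRQ with the device name as the last field); the function name irq_counts_for is hypothetical.

```python
import os


def irq_counts_for(interrupts_text, ifname):
    """Map IRQ number -> (queue name, per-CPU counts) for one interface."""
    lines = interrupts_text.splitlines()
    ncpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    result = {}
    for line in lines[1:]:
        fields = line.split()
        irq = fields[0].rstrip(":") if fields else ""
        if len(fields) < ncpus + 2 or not irq.isdigit():
            continue  # skip NMI, ERR and other non-numeric rows
        name = fields[-1]
        if name.startswith(ifname):
            counts = [int(c) for c in fields[1:1 + ncpus]]
            result[int(irq)] = (name, counts)
    return result


if __name__ == "__main__":
    path = "/proc/interrupts"
    if os.path.exists(path):
        with open(path) as f:
            table = irq_counts_for(f.read(), "enp2s0")
        for irq, (name, counts) in sorted(table.items()):
            # Show which CPU has served each queue interrupt the most.
            print(irq, name, "busiest CPU:", counts.index(max(counts)))
```

With irqbalance doing its job I would expect a different busiest CPU for most of the queues.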
The NIC uses all the CPUs because that is what this file says:
[root@kvm2 ~]# cat /sys/class/net/enp2s0/device/local_cpulist
0-31
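The value 0-31 is a CPU list: ranges and single numbers separated by commas. As a small sketch, this is how such a list can be expanded in Python. The function name parse_cpulist is my own, and the rarer stride form (for example 0-10:2) is deliberately not handled.

```python
def parse_cpulist(text):
    """Expand a kernel CPU list such as "0-3,8,10-11" into a list of ints."""
    cpus = []
    for part in text.strip().split(","):
        if not part:
            continue  # empty list or stray comma
        if "-" in part:
            first, last = part.split("-")
            cpus.extend(range(int(first), int(last) + 1))
        else:
            cpus.append(int(part))
    return cpus


if __name__ == "__main__":
    # The value read from local_cpulist on my server.
    print(len(parse_cpulist("0-31")))  # 32
```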
The name "enp2s0-TxRx-0" denotes a network queue. In this case transmit and receive are paired on the same interrupt. The system automatically creates 32 queues because there are 32 cores.
[root@kvm2 net]# ethtool -g enp2s0
Ring parameters for enp2s0:
Pre-set maximums:
RX: 4096
RX Mini: 0
RX Jumbo: 0
TX: 4096
Current hardware settings:
RX: 512
RX Mini: 0
RX Jumbo: 0
TX: 512
Here is the number of queues for the NIC:
[root@kvm2 net]# ethtool -l enp2s0
Channel parameters for enp2s0:
Pre-set maximums:
RX: 0
TX: 0
Other: 1
Combined: 63
Current hardware settings:
RX: 0
TX: 0
Other: 1
Combined: 32
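To compare the current value with the maximum in a script, the output can be split into its two sections. The following is only a sketch: parse_channels is a name I made up, and it assumes the section headers are exactly the ones printed above.

```python
def parse_channels(text):
    """Split ethtool output into its "max" and "current" sections."""
    sections = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("Pre-set maximums"):
            current = sections.setdefault("max", {})
        elif line.startswith("Current hardware settings"):
            current = sections.setdefault("current", {})
        elif current is not None and ":" in line:
            key, value = line.split(":", 1)
            value = value.strip()
            if value.isdigit():  # ignore non-numeric values
                current[key.strip()] = int(value)
    return sections


if __name__ == "__main__":
    # Output copied from the ethtool -l run on my server.
    output = """Channel parameters for enp2s0:
Pre-set maximums:
RX: 0
TX: 0
Other: 1
Combined: 63
Current hardware settings:
RX: 0
TX: 0
Other: 1
Combined: 32
"""
    ch = parse_channels(output)
    print(ch["current"]["Combined"], "of", ch["max"]["Combined"])  # 32 of 63
```

The ethtool -g output uses the same two headers, so the same function should work for the ring sizes as well.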
[root@kvm2 net]# ls -p /sys/class/net/enp2s0/queues/
rx-0/ rx-12/ rx-16/ rx-2/ rx-23/ rx-27/ rx-30/ rx-6/ tx-0/ tx-12/ tx-16/ tx-2/ tx-23/ tx-27/ tx-30/ tx-6/
rx-1/ rx-13/ rx-17/ rx-20/ rx-24/ rx-28/ rx-31/ rx-7/ tx-1/ tx-13/ tx-17/ tx-20/ tx-24/ tx-28/ tx-31/ tx-7/
rx-10/ rx-14/ rx-18/ rx-21/ rx-25/ rx-29/ rx-4/ rx-8/ tx-10/ tx-14/ tx-18/ tx-21/ tx-25/ tx-29/ tx-4/ tx-8/
rx-11/ rx-15/ rx-19/ rx-22/ rx-26/ rx-3/ rx-5/ rx-9/ tx-11/ tx-15/ tx-19/ tx-22/ tx-26/ tx-3/ tx-5/ tx-9/
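Counting the directories by hand is tedious, so here is a minimal sketch that counts the rx and tx queues from the directory names. The function name count_queues is hypothetical. On the listing above it should report 32 of each, which matches the 32 combined channels.

```python
import os
import re


def count_queues(names):
    """Count rx-N and tx-N entries in a list of queue directory names."""
    counts = {"rx": 0, "tx": 0}
    for name in names:
        # Accept names with or without a trailing slash.
        m = re.fullmatch(r"(rx|tx)-(\d+)/?", name)
        if m:
            counts[m.group(1)] += 1
    return counts


if __name__ == "__main__":
    path = "/sys/class/net/enp2s0/queues"
    if os.path.isdir(path):
        print(count_queues(os.listdir(path)))
```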