Has anyone used Virt-manager to manage qemu/kvm virtual machines? It stopped working for me after an upgrade.

Here's the output after restarting the service, with the connection working:

# /lib/systemd/system/libvirtd.service
[Unit]
Description=Virtualization daemon
Requires=virtlogd.socket
Requires=virtlockd.socket
# Use Wants instead of Requires so that users
# can disable these three .socket units to revert
# to a traditional non-activation deployment setup
Wants=libvirtd.socket
Wants=libvirtd-ro.socket
Wants=libvirtd-admin.socket
Wants=systemd-machined.service
After=network.target
After=firewalld.service
After=iptables.service
After=ip6tables.service
After=dbus.service
After=iscsid.service
After=apparmor.service
After=local-fs.target
After=remote-fs.target
After=systemd-logind.service
After=systemd-machined.service
After=xencommons.service
Conflicts=xendomains.service
Documentation=man:libvirtd(8)
Documentation=https://libvirt.org

[Service]
Type=notify
Environment=LIBVIRTD_ARGS="--timeout 120"
EnvironmentFile=-/etc/default/libvirtd
ExecStart=/usr/sbin/libvirtd $LIBVIRTD_ARGS
ExecReload=/bin/kill -HUP $MAINPID
KillMode=process
Restart=on-failure
# At least 1 FD per guest, often 2 (eg qemu monitor + qemu agent).
# eg if we want to support 4096 guests, we'll typically need 8192 FDs
# If changing this, also consider virtlogd.service & virtlockd.service
# limits which are also related to number of guests
LimitNOFILE=8192
# The cgroups pids controller can limit the number of tasks started by
# the daemon, which can limit the number of domains for some hypervisors.
# A conservative default of 8 tasks per guest results in a TasksMax of
# 32k to support 4096 guests.
TasksMax=32768
# With cgroups v2 there is no devices controller anymore, we have to use
# eBPF to control access to devices.  In order to do that we create a eBPF
# hash MAP which locks memory.  The default map size for 64 devices together
# with program takes 12k per guest.  After rounding up we will get 64M to
# support 4096 guests.
LimitMEMLOCK=64M

[Install]
WantedBy=multi-user.target
Also=virtlockd.socket
Also=virtlogd.socket
Also=libvirtd.socket
Also=libvirtd-ro.socket

Nothing wrong there. Strange. Does a Type=notify service stay active after its main process exits? I'd like to find someone whose libvirtd.service is working normally and see whether their service also ends up with two dnsmasq processes.

The immediate cause of the problem is that the libvirtd service is started on demand each time a client connects. But now, after the service starts, the libvirtd process exits, leaving two dnsmasq processes behind, and the service is still considered to be running normally. So the next time activation should be triggered, nothing happens at all, while the libvirtd process has long since exited, and everything just hangs.

How do I check this "main process of a Type=notify service"?
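systemd itself records a service's main PID and state, and systemctl show can print them. A minimal sketch (just wrapping systemctl show in Python; libvirtd.service is the unit from this thread):

#!/usr/bin/python3
# Print what systemd believes the unit's state and main PID are.
# If ActiveState is still "active" after the libvirtd process is gone,
# the unit is in the stale state described above.
import subprocess

def prop(unit, name):
    # "systemctl show UNIT -p NAME" prints a single "NAME=VALUE" line
    out = subprocess.run(['systemctl', 'show', unit, '-p', name],
                         capture_output=True, text=True, check=True).stdout.strip()
    return out.split('=', 1)[1]

for name in ('ActiveState', 'SubState', 'MainPID'):
    print(name, '=', prop('libvirtd.service', name))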

Not reproducible on systemd 252 (252.6-1).

@terbyrap can you run systemctl --version and check your systemd version?

It's this version:

systemd 252 (252.11-1)
+PAM +AUDIT +SELINUX +APPARMOR +IMA +SMACK +SECCOMP +GCRYPT -GNUTLS +OPENSSL +ACL +BLKID +CURL +ELFUTILS +FIDO2 +IDN2 -IDN +IPTC +KMOD +LIBCRYPTSETUP +LIBFDISK +PCRE2 -PWQUALITY +P11KIT +QRENCODE +TPM2 +BZIP2 +LZ4 +XZ +ZLIB +ZSTD -BPF_FRAMEWORK -XKBCOMMON +UTMP +SYSVINIT default-hierarchy=unified

Why is your version different? Can you run apt show systemd?

Are you on testing?

Isn't my .11 higher than your .6?

Package: systemd
Version: 252.11-1
Priority: important
Section: admin
Maintainer: Debian systemd Maintainers <pkg-systemd-maintainers@lists.alioth.debian.org>
Installed-Size: 9,870 kB
Provides: systemd-sysusers (= 252.11-1), systemd-tmpfiles (= 252.11-1)
Pre-Depends: libblkid1 (>= 2.24), libc6 (>= 2.34), libcap2 (>= 1:2.10), libgcrypt20 (>= 1.10.0), liblz4-1 (>= 0.0~r122), liblzma5 (>= 5.1.1alpha+20120614), libmount1 (>= 2.30), libselinux1 (>= 3.1~), libssl3 (>= 3.0.0), libzstd1 (>= 1.5.2)
Depends: libacl1 (>= 2.2.23), libaudit1 (>= 1:2.2.1), libblkid1 (>= 2.24.2), libcryptsetup12 (>= 2:2.4), libfdisk1 (>= 2.33), libkmod2 (>= 15), libp11-kit0 (>= 0.23.18.1), libseccomp2 (>= 2.3.1), libsystemd-shared (= 252.11-1), libsystemd0 (= 252.11-1), mount
Recommends: default-dbus-system-bus | dbus-system-bus, systemd-timesyncd | time-daemon
Suggests: systemd-container, systemd-homed, systemd-userdbd, systemd-boot, systemd-resolved, libfido2-1, libqrencode4, libtss2-esys-3.0.2-0, libtss2-mu0, libtss2-rc0, polkitd | policykit-1
Conflicts: consolekit, libpam-ck-connector, systemd-shim
Breaks: less (<< 563), resolvconf (<< 1.83~), sicherboot (<< 0.1.6), udev (<< 247~)
Homepage: https://www.freedesktop.org/wiki/Software/systemd
Tag: admin::boot, implemented-in::c, interface::daemon, role::program,
 works-with::software:running
Download-Size: 3,025 kB
APT-Manual-Installed: yes
APT-Sources: https://mirrors.cloud.tencent.com/debian trixie/main amd64 Packages
Description: system and service manager
 systemd is a system and service manager for Linux. It provides aggressive
 parallelization capabilities, uses socket and D-Bus activation for starting
 services, offers on-demand starting of daemons, keeps track of processes
 using Linux control groups, maintains mount and automount points and
 implements an elaborate transactional dependency-based service control logic.
 .
 Installing the systemd package will not switch your init system unless you
 boot with init=/lib/systemd/systemd or install the systemd-sysv package in
 addition.

Yes, I'm on Debian 13 Trixie.

You could consider filing a bug report with Debian.

So will one of you file it?

I don't understand any of this at all. :pray:

I won't file it, since I don't use it myself and couldn't respond to "try ... and see what happens" requests.

PS: This problem doesn't reproduce on Arch Linux with systemd 253 (253.5-2-arch) either. Not sure whether a Debian patch broke it.

There are indeed two dnsmasq processes; I think you're right.

Wait, no. On my machine there are also two dnsmasq processes after the libvirtd process exits, but libvirtd still starts up normally the next time I start a VM.

Sorry, I got it wrong again: on my machine libvirtd is only started on demand the first time it's used; after that it keeps running and never exits.

Wrong again: libvirtd does exit, about two minutes after the VM shuts down... (presumably the --timeout 120 from LIBVIRTD_ARGS in the unit file).

So how would I file it from here?
Should I just wait for upgrades and see whether someone else handles it?

Besides not being very familiar with the bug-reporting process, my network connection to GitHub and overseas sites is also very, very slow right now.

Otherwise I'd have fired up a proxy long ago and chatted with you all in the Telegram group.
I haven't been on Telegram in ages.

But I tried it, and after the main process exits the service is considered no longer running. Test code:

#!/usr/bin/python3
# Leave a background child behind, tell systemd we are ready, then exit.

import os
import systemd.daemon  # from the python3-systemd package

def main():
    os.system('sleep 120 &')          # child that outlives the main process
    systemd.daemon.notify('READY=1')  # Type=notify readiness signal

if __name__ == '__main__':
    main()

[Unit]
Description=test daemon

[Service]
Type=notify
ExecStart=.../t.py
KillMode=process
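To exercise the test (a sketch, assuming the script sits at the path in ExecStart and the unit is installed as, say, t.service): run systemctl daemon-reload, then systemctl start t.service, then check systemctl is-active t.service. Once the Python process has exited it should report inactive, even though the sleep child is still alive; KillMode=process only exempts leftover processes from being killed, it does not keep the unit active.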

As I recall, Debian bugs are filed by email. https://bugs.debian.org/cgi-bin/pkgreport.cgi?pkg=systemd;dist=testing


What's your systemd version? Even with those two dnsmasq processes left behind, systemd should still consider the service as not running.

I've updated my reply again; please take another look.
My systemd version is 252.6.

Same as mine (the version without the problem).

Also, editing a post doesn't send a notification, you know.