Periodic System Freezes

asked 2019-03-25 11:13:57 -0600

m.drahcir

updated 2019-03-25 12:29:27 -0600

On a regular basis (approximately every half hour), my system locks up. During these freezes the hard drive spins wildly and both dual-core processors run right at 100%. Gnome-System-Monitor does not show what is using the resources. These lockups use so much of the system's resources that I am unable to do anything until they stop. The mouse pointer barely moves, and clicks do not register until the system calms down again. It usually takes about five minutes for the system to finish whatever it is doing and release its hold on both the processors and the hard drive. However, sometimes it locks up so badly that I have to hard boot the machine in the middle of whatever it is doing. With the hard drive maxed out, this is obviously quite dangerous.

I simply CANNOT have this happening. To say getting locked out of my system could be catastrophic is a huge understatement. Last night my system was in screensaver (xscreensaver - blank screen) and the hard drive was running maxed out. I could not get into the system and had to hard boot at 10:35. This happens way too often.

The logs show very little unfortunately. I will post system information below along with the logs from last night before it was hard booted. I use XFCE as a desktop environment to minimize system load.

Please help. I need to have this system stable and operational at all times. Have been using Fedora since Core 1. Only the later versions have been doing this. I upgraded from FC25 hoping it would help, but it has not.

Linux localhost.localdomain 4.18.19-100.fc27.x86_64 #1 SMP Wed Nov 14 22:04:34 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

Getting SMBIOS data from sysfs.
SMBIOS 2.3 present.

Handle 0x0001, DMI type 1, 27 bytes
System Information
    Manufacturer: Supermicro
    Product Name: H8DC8
    Version: 1234567890
    Serial Number: 1234567890
    UUID: (omitted)
    Wake-up Type: Power Switch
    SKU Number: To Be Filled By O.E.M.
    Family: To Be Filled By O.E.M.

Handle 0x0028, DMI type 15, 35 bytes
System Event Log
    Area Length: 4 bytes
    Header Start Offset: 0x0000
    Header Length: 2 bytes
    Data Start Offset: 0x0002
    Access Method: Indexed I/O, one 16-bit index port, one 8-bit data port
    Access Address: Index 0x046A, Data 0x046C
    Status: Invalid, Not Full
    Change Token: 0x00000000
    Header Format: No Header
    Supported Log Type Descriptors: 6
    Descriptor 1: End of log
    Data Format 1: OEM-specific
    Descriptor 2: End of log
    Data Format 2: OEM-specific
    Descriptor 3: End of log
    Data Format 3: OEM-specific
    Descriptor 4: End of log
    Data Format 4: OEM-specific
    Descriptor 5: End of log
    Data Format 5: OEM-specific
    Descriptor 6: End of log
    Data Format 6: OEM-specific

Handle 0x002F, DMI type 32, 20 bytes
System Boot Information
    Status: No errors detected

Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 2
On-line CPU(s) list: 0,1
Thread ...


Comments

Nothing you have provided sheds any light on your problem. One thing: Fedora 27 has reached its EOL. Upgrade to F28 and we may be able to help.

What may help to analyse it: press CTRL+ALT+F3, log in as root, run "top -c", then press CTRL+ALT+F1 to go back to your desktop and wait until it happens. Then press CTRL+ALT+F3 to return to the text console running top and check what it shows.

rdtcustomercare ( 2019-03-25 15:18:19 -0600 )

May I suggest a cold shutdown, then remove and reinsert the memory modules and the CPUs? What's the peak temp on them?

K7AAY ( 2019-03-25 17:53:50 -0600 )

9 Answers

answered 2019-04-16 11:10:35 -0600

m.drahcir

Okay, after stopping TOR and removing GNOME Software and PackageKit, the lockups have been less frequent. However, they persist. I am still having issues with sssd_kcm, as the following log shows. Any ideas?

Apr 16 11:33:25 localhost.localdomain audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=sssd-kcm comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Apr 16 11:33:35 localhost.localdomain audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=systemd-coredump@6-8098-0 comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Apr 16 11:33:04 localhost.localdomain CROND[8134]: (root) CMD (/usr/sbin/iotop -bo --iter=10 >> /var/log/iotop)
Apr 16 11:33:04 localhost.localdomain systemd[1]: Started Process Core Dump (PID 8098/UID 0).
-- Subject: Unit systemd-coredump@6-8098-0.service has finished start-up
-- Defined-By: systemd
-- Support: https://lists.freedesktop.org/mailman...
-- Unit systemd-coredump@6-8098-0.service has finished starting up.
-- The start-up result is done.
Apr 16 11:33:24 localhost.localdomain systemd[1]: sssd-kcm.service: Main process exited, code=dumped, status=6/ABRT
Apr 16 11:33:24 localhost.localdomain systemd[1]: sssd-kcm.service: Unit entered failed state.
Apr 16 11:33:24 localhost.localdomain systemd[1]: sssd-kcm.service: Failed with result 'core-dump'.
Apr 16 11:33:25 localhost.localdomain systemd[1]: Started SSSD Kerberos Cache Manager.
-- Subject: Unit sssd-kcm.service has finished start-up
-- Defined-By: systemd
-- Support: https://lists.freedesktop.org/mailman...
-- Unit sssd-kcm.service has finished starting up.
-- The start-up result is done.
Apr 16 11:33:30 localhost.localdomain sssd[kcm][8151]: Starting up
Apr 16 11:33:30 localhost.localdomain systemd-coredump[8133]: Process 29198 (sssd_kcm) of user 0 dumped core.

                                                          Stack trace of thread 29198:
                                                          #0  0x00007f96cda8b750 raise (libc.so.6)
                                                          #1  0x00007f96cda8cd31 abort (libc.so.6)
                                                          #2  0x00007f96ce29f3ac talloc_abort (libtalloc.so.2)
                                                          #3  0x00007f96ce29ee58 _talloc_free (libtalloc.so.2)
                                                          #4  0x000055f3965e73bb schedule_fd_processing (sssd_kcm)
                                                          #5  0x00007f96d268702c update_timer (libcurl.so.4)
                                                          #6  0x00007f96d26889e4 curl_multi_add_handle (libcurl.so.4)
                                                          #7  0x000055f3965e7c43 tcurl_request_send (sssd_kcm)
                                                          #8  0x000055f3965e8508 tcurl_http_send (sssd_kcm)
                                                          #9  0x000055f3965da4a9 sec_list_send (sssd_kcm)
                                                          #10 0x000055f3965da96f ccdb_sec_list_send (sssd_kcm)
                                                          #11 0x000055f3965d55a2 kcm_ccdb_list_send (sssd_kcm)
                                                          #12 0x000055f3965e018e kcm_op_get_cache_uuid_list_send (sssd_kcm)
                                                          #13 0x000055f3965de6f3 kcm_cmd_queue_done (sssd_kcm)
                                                          #14 0x00007f96ce4b59c4 tevent_common_loop_immediate (libtevent.so.0)
                                                          #15 0x00007f96ce4ba54b epoll_event_loop_once (libtevent.so.0)
                                                          #16 0x00007f96ce4b8ba7 std_event_loop_once (libtevent.so.0)
                                                          #17 0x00007f96ce4b4fed _tevent_loop_once (libtevent.so.0)
                                                          #18 0x00007f96ce4b520b tevent_common_loop_wait (libtevent.so.0)
                                                          #19 0x00007f96ce4b8b47 std_event_loop_wait (libtevent.so.0)
                                                          #20 0x00007f96d1fff763 server_loop (libsss_util.so)
                                                          #21 0x000055f3965d2a97 main (sssd_kcm)
                                                          #22 0x00007f96cda77fea __libc_start_main (libc.so.6)
                                                          #23 0x000055f3965d2c5a _start (sssd_kcm)

-- Subject: Process 29198 (sssd_kcm) dumped core
-- Defined-By: systemd
-- Support: https://lists.freedesktop.org/mailman...
-- Documentation: man:core(5)
-- Process 29198 (sssd_kcm) crashed and dumped core.
-- This usually indicates a programming error in the crashing program and
-- should be reported to its vendor as a bug.
Apr 16 11:34:02 localhost.localdomain CROND[8180]: (root) CMD (/usr/sbin/iotop -bo --iter=10 >> /var/log/iotop)
Apr 16 11:35:01 localhost.localdomain CROND[8252]: (root) CMD (/usr/sbin/iotop ...

answered 2019-04-07 16:34:24 -0600

m.drahcir

System locked up today at about 17:25. Results from journalctl -xef from just before to just after incident:

Apr 07 17:23:59 localhost.localdomain systemd[1]: Started Hostname Service.
-- Subject: Unit systemd-hostnamed.service has finished start-up
-- Defined-By: systemd
-- Support: https://lists.freedesktop.org/mailman...
-- Unit systemd-hostnamed.service has finished starting up.
-- The start-up result is done.
Apr 07 17:24:01 localhost.localdomain CROND[7565]: (root) CMD (/usr/sbin/iotop -bo --iter=10 >> /var/log/iotop)
Apr 07 17:25:01 localhost.localdomain CROND[7687]: (root) CMD (/usr/sbin/iotop -bo --iter=10 >> /var/log/iotop)
Apr 07 17:25:29 localhost.localdomain audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=systemd-hostnamed comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Apr 07 17:25:31 localhost.localdomain dbus-daemon[708]: [system] Activating via systemd: service name='org.freedesktop.hostname1' unit='dbus-org.freedesktop.hostname1.service' requested by ':1.907' (uid=1000 pid=15480 comm="./firefox.real --class Tor Browser -profile TorBro" label="unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023")
Apr 07 17:25:31 localhost.localdomain systemd[1]: Starting Hostname Service...
-- Subject: Unit systemd-hostnamed.service has begun start-up
-- Defined-By: systemd
-- Support: https://lists.freedesktop.org/mailman...
-- Unit systemd-hostnamed.service has begun starting up.
Apr 07 17:25:31 localhost.localdomain dbus-daemon[708]: [system] Successfully activated service 'org.freedesktop.hostname1'
Apr 07 17:25:31 localhost.localdomain systemd[1]: Started Hostname Service.
-- Subject: Unit systemd-hostnamed.service has finished start-up
-- Defined-By: systemd
-- Support: https://lists.freedesktop.org/mailman...
-- Unit systemd-hostnamed.service has finished starting up.
-- The start-up result is done.
Apr 07 17:25:31 localhost.localdomain audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=systemd-hostnamed comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Apr 07 17:26:01 localhost.localdomain CROND[7752]: (root) CMD (/usr/sbin/iotop -bo --iter=10 >> /var/log/iotop)
Apr 07 17:26:02 localhost.localdomain audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=systemd-hostnamed comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Apr 07 17:26:15 localhost.localdomain dbus-daemon[708]: [system] Activating via systemd: service name='org.freedesktop.hostname1' unit='dbus-org.freedesktop.hostname1.service' requested by ':1.909' (uid=1000 pid=15480 comm="./firefox.real --class Tor Browser -profile TorBro" label="unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023")
Apr 07 17:26:16 localhost.localdomain systemd[1]: Starting Hostname Service...
-- Subject: Unit systemd-hostnamed.service has begun start-up
-- Defined-By: systemd
-- Support: https://lists.freedesktop.org/mailman...
-- Unit systemd-hostnamed.service has begun starting up.
Apr 07 17:26:18 localhost.localdomain dbus-daemon[708]: [system] Successfully activated service 'org.freedesktop.hostname1'
Apr 07 17:26:18 localhost.localdomain audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=systemd-hostnamed comm="systemd" exe="/usr/lib/systemd/systemd" hostname ...

Comments

Sorry - no idea why the formatting did what it did.

m.drahcir ( 2019-04-07 16:35:04 -0600 )

Apr 07 17:25:31 localhost.localdomain dbus-daemon[708]: [system] Activating via systemd: service name='org.freedesktop.hostname1' unit='dbus-org.freedesktop.hostname1.service' requested by ':1.907' (uid=1000 pid=15480 comm="./firefox.real --class Tor Browser -profile TorBro" label="unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023")

Tor seems to be your problem.

rdtcustomercare ( 2019-04-08 13:29:34 -0600 )

Thank you rdtcustomercare. Interesting. Will keep TOR shut down for a bit and see if the problems cease.

m.drahcir ( 2019-04-08 21:32:41 -0600 )

answered 2019-04-08 13:02:00 -0600

m.drahcir

Okay, the system locked up completely a short while ago. Below are the contents from journalctl -xef (note: most of this was in red font, so it was BAD). Of course, by the time I went to see what process 16831 was, it had closed, so I do not know what it was:

Apr 08 13:53:13 localhost.localdomain systemd-coredump[18900]: Process 16831 (sssd_kcm) of user 0 dumped core.

                                                           Stack trace of thread 16831:
                                                           #0  0x00007f52e377f750 raise (libc.so.6)
                                                           #1  0x00007f52e3780d31 abort (libc.so.6)
                                                           #2  0x00007f52e3f933ac talloc_abort (libtalloc.so.2)
                                                           #3  0x00007f52e3f92e58 _talloc_free (libtalloc.so.2)
                                                           #4  0x0000555e823a43bb schedule_fd_processing (sssd_kcm)
                                                           #5  0x00007f52e837b02c update_timer (libcurl.so.4)
                                                           #6  0x00007f52e837c9e4 curl_multi_add_handle (libcurl.so.4)
                                                           #7  0x0000555e823a4c43 tcurl_request_send (sssd_kcm)
                                                           #8  0x0000555e823a5508 tcurl_http_send (sssd_kcm)
                                                           #9  0x0000555e823974a9 sec_list_send (sssd_kcm)
                                                           #10 0x0000555e8239796f ccdb_sec_list_send (sssd_kcm)
                                                           #11 0x0000555e823925a2 kcm_ccdb_list_send (sssd_kcm)
                                                           #12 0x0000555e8239d18e kcm_op_get_cache_uuid_list_send (sssd_kcm)
                                                           #13 0x0000555e8239b6f3 kcm_cmd_queue_done (sssd_kcm)
                                                           #14 0x00007f52e41a99c4 tevent_common_loop_immediate (libtevent.so.0)
                                                           #15 0x00007f52e41ae54b epoll_event_loop_once (libtevent.so.0)
                                                           #16 0x00007f52e41acba7 std_event_loop_once (libtevent.so.0)
                                                           #17 0x00007f52e41a8fed _tevent_loop_once (libtevent.so.0)
                                                           #18 0x00007f52e41a920b tevent_common_loop_wait (libtevent.so.0)
                                                           #19 0x00007f52e41acb47 std_event_loop_wait (libtevent.so.0)
                                                           #20 0x00007f52e7cf3763 server_loop (libsss_util.so)
                                                           #21 0x0000555e8238fa97 main (sssd_kcm)
                                                           #22 0x00007f52e376bfea __libc_start_main (libc.so.6)
                                                           #23 0x0000555e8238fc5a _start (sssd_kcm)

-- Subject: Process 16831 (sssd_kcm) dumped core
-- Defined-By: systemd
-- Support: https://lists.freedesktop.org/mailman...
-- Documentation: man:core(5)
-- Process 16831 (sssd_kcm) crashed and dumped core.
-- This usually indicates a programming error in the crashing program and
-- should be reported to its vendor as a bug.
Apr 08 13:53:13 localhost.localdomain audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=systemd-coredump@1-18877-0 comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
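The two sssd_kcm crash reports posted in this thread show the same backtrace (talloc_abort reached from schedule_fd_processing through libcurl), so it may be useful to see how often the daemon dumps core. What follows is only a minimal sketch, assuming journal text is piped in on standard input; count_core_dumps is a hypothetical helper name, not part of systemd or SSSD.

```bash
#!/usr/bin/env bash
# count_core_dumps (hypothetical name): read journal text on stdin and
# print "<process name> <number of core dumps>" for each crashing program.
count_core_dumps() {
    LC_ALL=C awk '
        /systemd-coredump\[[0-9]+\]: Process [0-9]+ \(/ && /dumped core/ {
            name = $0
            sub(/^.*: Process [0-9]+ \(/, "", name)   # drop everything up to "("
            sub(/\).*$/, "", name)                    # drop ")" and the rest
            dumps[name]++
        }
        END { for (name in dumps) printf "%s %d\n", name, dumps[name] }
    '
}

# Example: journalctl -t systemd-coredump | count_core_dumps
```

On systems where systemd-coredump stores the dumps, "coredumpctl list sssd_kcm" should give a similar overview without any parsing.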

Comments

From the sssd-kcm man page (SSSD Kerberos Cache Manager): "In a setup where Kerberos caches are managed by KCM, the Kerberos library (typically used through an application, like e.g. kinit) is a "KCM client" and the KCM daemon is being referred to as a "KCM server". The client and server communicate over a UNIX socket."

kdeinit used to hog CPU time. While it does not specifically show up in Gnome System Monitor, I believe it may still be an issue. How do I keep it from overusing CPU resources?

m.drahcir ( 2019-04-08 15:08:46 -0600 )

answered 2019-04-06 11:11:14 -0600

m.drahcir

11:24 54 system lockup. Was doing searches in Firefox. This is everything reported since then, current time: 12:07

Contents of /var/log/iotop:

Total DISK READ : 0.00 B/s | Total DISK WRITE : 0.00 B/s
Actual DISK READ: 0.00 B/s | Actual DISK WRITE: 0.00 B/s
TID PRIO USER DISK READ DISK WRITE SWAPIN IO COMMAND
b' 1983 be/4 nooneinparticular 0.00 B/s 0.00 B/s 0.00 % 19.35 % Telegram -noupdate'
Total DISK READ : 7.14 K/s | Total DISK WRITE : 17.85 K/s
Actual DISK READ: 7.14 K/s | Actual DISK WRITE: 0.00 B/s
TID PRIO USER DISK READ DISK WRITE SWAPIN IO COMMAND
b'13298 be/4 root 7.14 K/s 3.57 K/s 0.00 % 2.73 % python3 -s /usr/sbin/iotop -bo --iter=10'
b'12953 be/4 root 0.00 B/s 0.00 B/s 0.00 % 0.04 % [kworker/0:1-events]'
b'12683 be/4 root 0.00 B/s 14.28 K/s 0.00 % 0.00 % krusader --left=/etc/cron.d --right=/home/nooneinparticular'
Total DISK READ : 0.00 B/s | Total DISK WRITE : 0.00 B/s
Actual DISK READ: 0.00 B/s | Actual DISK WRITE: 0.00 B/s
TID PRIO USER DISK READ DISK WRITE SWAPIN IO COMMAND
b'12953 be/4 root 0.00 B/s 0.00 B/s 0.00 % 0.55 % [kworker/0:1-events]'
b'13276 be/4 root 0.00 B/s 0.00 B/s 0.00 % 0.08 % [kworker/0:2-events]'
Total DISK READ : 0.00 B/s | Total DISK WRITE : 0.00 B/s
Actual DISK READ: 0.00 B/s | Actual DISK WRITE: 0.00 B/s
TID PRIO USER DISK READ DISK WRITE SWAPIN IO COMMAND
b'12953 be/4 root 0.00 B/s 0.00 B/s 0.00 % 0.65 % [kworker/0:1-events]'
b'13276 be/4 root 0.00 B/s 0.00 B/s 0.00 % 0.09 % [kworker/0:2-events]'
Total DISK READ : 0.00 B/s | Total DISK WRITE : 22.03 K/s
Actual DISK READ: 0.00 B/s | Actual DISK WRITE: 29.37 K/s
TID PRIO USER DISK READ DISK WRITE SWAPIN IO COMMAND
b' 457 be/3 root 0.00 B/s 22.03 K/s 0.00 % 2.13 % [jbd2/dm-0-8]'
Total DISK READ : 0.00 B/s | Total DISK WRITE : 0.00 B/s
Actual DISK READ: 0.00 B/s | Actual DISK WRITE: 0.00 B/s
TID PRIO USER DISK READ DISK WRITE SWAPIN IO COMMAND
b'13276 be/4 root 0.00 B/s 0.00 B/s 0.00 % 0.83 % [kworker/0:2-events]'
Total DISK READ : 0.00 B/s | Total DISK WRITE : 0.00 B/s
Actual DISK READ: 0.00 B/s | Actual DISK WRITE: 0.00 B/s
TID PRIO USER DISK READ DISK WRITE SWAPIN IO COMMAND
b ...

answered 2019-04-02 15:50:49 -0600

m.drahcir

Current log contents:

12:35:44 cupsd: Job stopped due to filter errors; please consult the error_log file for details.
12:35:44 kernel: usblp 1-2:1.1: usblp0: USB Bidirectional printer dev 2 if 1 alt 0 proto 2 vid 0x03F0 pid 0x5512
12:35:40 kernel: usblp0: removed
10:09:56 kernel: [Hardware Error]: cache level: L3/GEN, mem/io: MEM, mem-tx: RD, part-proc: SRC (no timeout)
10:09:56 kernel: [Hardware Error]: cache level: L3/GEN, mem/io: MEM, mem-tx: RD, part-proc: SRC (no timeout)
10:09:56 kernel: [Hardware Error]: MC4 Error (node 1): DRAM ECC error detected on the NB.
10:09:56 kernel: [Hardware Error]: Error Addr: 0x0000000093c379a8
10:09:56 kernel: [Hardware Error]: CPU:1 (f:25:1) MC4_STATUS[-|CE|-|-|AddrV|CECC]: 0x9447c00100000813
10:09:56 kernel: [Hardware Error]: Corrected error, no action required.
10:09:56 kernel: [Hardware Error]: cache level: L3/GEN, mem/io: MEM, mem-tx: RD, part-proc: SRC (no timeout)
10:09:56 kernel: EDAC MC1: 1 CE on unknown memory (csrow:0 channel:1 page:0x93c37 offset:0x9a8 grain:0 syndrome:0x8f)
10:09:56 kernel: [Hardware Error]: MC4 Error (node 1): DRAM ECC error detected on the NB.
10:09:56 kernel: [Hardware Error]: MC4 Error (node 1): DRAM ECC error detected on the NB.
10:09:56 kernel: [Hardware Error]: Error Addr: 0x0000000093c379a8
10:09:56 kernel: [Hardware Error]: CPU:1 (f:25:1) MC4_STATUS[-|CE|-|-|AddrV|CECC]: 0x9447c00100000813
10:09:56 kernel: [Hardware Error]: Corrected error, no action required.
10:09:56 kernel: mce: [Hardware Error]: Machine check events logged
10:04:29 kernel: [Hardware Error]: cache level: L3/GEN, mem/io: MEM, mem-tx: RD, part-proc: SRC (no timeout)
10:04:29 kernel: [Hardware Error]: MC4 Error (node 1): DRAM ECC error detected on the NB.
10:04:29 kernel: [Hardware Error]: Error Addr: 0x0000000098d3d928
10:04:29 kernel: [Hardware Error]: CPU:1 (f:25:1) MC4_STATUS[-|CE|-|-|AddrV|CECC]: 0x9447c00100000813
10:04:29 kernel: [Hardware Error]: Corrected error, no action required.
10:04:29 kernel: [Hardware Error]: cache level: L3/GEN, mem/io: MEM, mem-tx: RD, part-proc: SRC (no timeout)
10:04:29 kernel: EDAC MC1: 1 CE on unknown memory (csrow:0 channel:1 page:0x98d3d offset:0x928 grain:0 syndrome:0x8f)
10:04:29 kernel: [Hardware Error]: MC4 Error (node 1): DRAM ECC error detected on the NB.
10:04:29 kernel: [Hardware Error]: MC4 Error (node 1): DRAM ECC error detected on the NB.
10:04:29 kernel: [Hardware Error]: Error Addr: 0x0000000098d3d928
10:04:29 kernel: [Hardware Error]: CPU:1 (f:25:1) MC4_STATUS[-|CE|-|-|AddrV|CECC]: 0x9447c00100000813
10:04:29 kernel: [Hardware Error]: Corrected error, no action required.
10:04:29 kernel: mce: [Hardware Error]: Machine check events logged
10:03:47 kernel: EXT4-fs (sdc1): mounted filesystem with ordered data mode. Opts: (null)
10:03:47 kernel: EXT4-fs (sdc1): mounted filesystem with ordered data mode. Opts: (null)
10:03:47 kernel: EXT4-fs (sdc1): warning: maximal mount count reached, running e2fsck ...
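The EDAC lines in this log name the memory controller, chip-select row and channel behind each corrected ECC error; both of them report MC1, csrow 0, channel 1. A minimal sketch for tallying such lines over a longer period follows. It assumes kernel log text on standard input, and tally_ecc is a hypothetical helper name. If one location keeps accumulating corrected errors, that may point at a single memory module, though this is only a hint and not a diagnosis.

```bash
#!/usr/bin/env bash
# tally_ecc (hypothetical name): read kernel log text on stdin and print
# "<controller> <csrow> <channel> <corrected error count>" per location.
tally_ecc() {
    LC_ALL=C awk '
        /EDAC MC[0-9]+: [0-9]+ CE / {
            mc = ""; row = ""; chan = ""; count = 0
            for (i = 1; i <= NF; i++) {
                if ($i ~ /^MC[0-9]+:$/) {
                    mc = substr($i, 1, length($i) - 1)   # strip trailing ":"
                    count = $(i + 1)                      # errors in this report
                }
                if ($i ~ /csrow:/)    { row = $i; sub(/^\(/, "", row) }
                if ($i ~ /^channel:/) { chan = $i }
            }
            total[mc " " row " " chan] += count
        }
        END { for (key in total) printf "%s %d\n", key, total[key] }
    '
}

# Example: journalctl -k | tally_ecc
```

Where the EDAC driver is loaded, the kernel also keeps running totals in /sys/devices/system/edac/mc/mc*/ce_count, which can be read without parsing any logs.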

answered 2019-04-01 09:24:44 -0600

m.drahcir

My system had a small outburst at 10:11 03. Here is the output from IOTOP:

/usr/sbin/iotop -bo --iter=10'
Total DISK READ : 0.00 B/s | Total DISK WRITE : 0.00 B/s
Actual DISK READ: 0.00 B/s | Actual DISK WRITE: 0.00 B/s
TID PRIO USER DISK READ DISK WRITE SWAPIN IO COMMAND
b'12953 be/4 root 0.00 B/s 0.00 B/s 0.00 % 0.65 % [kworker/0:1-events]'
Total DISK READ : 18.16 K/s | Total DISK WRITE : 0.00 B/s
Actual DISK READ: 18.16 K/s | Actual DISK WRITE: 0.00 B/s
TID PRIO USER DISK READ DISK WRITE SWAPIN IO COMMAND
b' 2555 be/7 nooneinparticular 18.16 K/s 0.00 B/s 0.00 % 2.11 % firefox -contentproc -childID 1 -isForBrowser -intPrefs 5:50|6:-1|18:0|28:1000|33:40|34:40|43:128|44:10000|49:0|51:400|52:1|53:0|54:0|59:0|60:120|61:120|92:2|93:1|107:5000|118:0|120:0|131:10000|155:24|156:32768|158:0|159:0|167:5|171:1048576|172:100|173:5000|175:600|176:4|177:1|186:1|200:60000| -boolPrefs 1:0|2:0|4:0|26:1|27:1|30:0|35:1|36:0|37:0|38:0|41:1|42:1|45:0|46:0|47:0|48:0|50:0|55:1|56:1|57:0|58:1|62:1|63:1|64:0|65:1|66:1|67:0|68:1|71:0|72:0|75:1|76:1|80:1|81:1|82:0|83:0|84:0|86:0|87:0|88:1|89:0|94:1|95:0|101:0|106:0|109:1|110:0|112:1|113:1|115:1|119:0|121:0|123:0|125:1|126:1|132:0|133:0|134:0|136:0|153:0|154:1|157:1|160:1|162:1|164:1|165:0|170:0|174:1|179:0|180:0|181:0|182:1|183:0|184:0|185:1|188:1|192:0|193:0|194:1|195:1|196:0|197:1|198:1|199:1|201:0|202:0|204:0|212:0|213:1|214:0|215:0|216:0| -stringPrefs 3:7;release|135:3;1.0|151:332; \xc2\xa0\xc2\xbc\xc2\xbd\xc2\xbe\xc7\x83\xcb\x90\xcc\xb7\xcc\xb8\xd6\x89\xd6\x8a\xd7\x83\xd7\xb4\xd8\x89\xd8\x8a\xd9\xaa\xdb\x94\xdc\x81\xdc\x82\xdc\x83\xdc\x84\xe1\x85\x9f\xe1\x85\xa0\xe1\x9c\xb5\xe2\x80\x80\xe2\x80\x81\xe2\x80\x82\xe2\x80\x83\xe2\x80\x84\xe2\x80\x85\xe2\x80\x86\xe2\x80\x87\xe2\x80\x88\xe2\x80\x89\xe2\x80\x8a\xe2\x80\x8b\xe2\x80\x8e\xe2\x80\x8f\xe2\x80\x90 ...

Comments

I believe this means rdtcustomercare was right. To me, it seems Firefox and rsyslogd get into an IO fight and lock up the system, though, I am certainly no expert at this or I wouldn't be having these problems. Is this the case? If so, how do I fix it, short of turning my computer into an artificial reef?

m.drahcir ( 2019-04-01 09:46:56 -0600 )

Trying the IONICE command to see if it helps by putting a leash on Firefox's IO scheduling. Will let you know if it works.

ionice -p 2
none: prio 4
[nooneinparticular@localhost ~]$ ionice -c3 -p 2382
[nooneinparticular@localhost ~]$ ionice -c3 -p 2555
[nooneinparticular@localhost ~]$ ionice -c3 -p 2081
[nooneinparticular@localhost ~]$ ionice -c3 -p 2079

m.drahcir ( 2019-04-01 16:01:14 -0600 )
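Typing one ionice command per PID can be wrapped in a small loop. This is a minimal sketch, assuming pgrep and ionice are installed; idle_io_for is a hypothetical helper name. Whether the idle class has any effect depends on the active I/O scheduler: CFQ and BFQ honor it, and "cat /sys/block/sda/queue/scheduler" shows which one is in use (sda is an assumed device name).

```bash
#!/usr/bin/env bash
# idle_io_for (hypothetical name): put every process whose name exactly
# matches $1 into the idle I/O scheduling class (ionice class 3).
idle_io_for() {
    local name="$1" pid
    for pid in $(pgrep -x "$name"); do
        ionice -c 3 -p "$pid"
    done
}

# Example: idle_io_for firefox
```

Processes started after the loop has run are not covered, so starting the program itself with "ionice -c 3 firefox" may be the simpler route.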

Oh yes, we are getting close \o/ abrt-dump-journal-xorg and firefox: that means Firefox is crashing in an endless loop and abrt-dump-journal-xorg is writing the crash messages to disk.

All you need to do now is simple: stop abrtd and look in the journal for what actually happened to Firefox. There is also a /var/spool/abrt directory; you may find information there too.

Then we will see how we can fix this.

As a quick fix, to stop I/O being wasted in case audit messages are involved too, execute "auditctl -e 0" as root. It temporarily disables auditing.

rdtcustomercare ( 2019-04-01 16:21:35 -0600 )

BTW: why are you using pacman with Fedora?

rdtcustomercare ( 2019-04-01 16:23:06 -0600 )

Thank you rdtcustomercare. Okay, I ran auditctl -e 0. When trying to view the journal, I get: "Failed to obtain all required information from journald". Going to /var/spool/abrt gives me an empty coredump file and nothing useful.

I have no idea where pacman came from. I do not recall installing it, though Amazon did deliver the lemur a shipment of pomegranates which I did not order...

m.drahcir ( 2019-04-01 18:48:03 -0600 )

answered 2019-03-31 17:33:35 -0600

m.drahcir

Latest update: today at 5 PM, I could hear my hard drive maxed out, spinning uncontrollably. I tried to log in from xscreensaver (blank screen, which uses very little system resources). The login prompt never appeared. The system continued to be maxed out FOR THE NEXT HOUR! At 6 PM, I had to hard boot the system. This is completely out of control and should never happen. There are times I need immediate access to the information on this system and cannot have this happening. Several times during this hour I tried CTRL+ALT+F3, but nothing. It never came up. Just a blank screen and the hard drive out of control.

I love Fedora. Have been with it since Core 1. I do not want to have to switch operating systems, but this is unacceptable. Please, suggest something. If I cannot get this fixed, I am going to have to ditch Fedora.

answered 2019-03-31 11:00:04 -0600

m.drahcir

Please forgive the delay. I have tried several times to CTRL+ALT+F3 and log in as root. The problem is, the system will not let anything happen when it locks up. By the time I can finally log in as root, there is nothing out of the ordinary. A system lockup happened while I was typing this: GKRELLM froze the clock (11:04 14 for the last six minutes), and the CPU and hard drive monitors maxed out. The hard drive runs at anywhere from 40 to 75 MB per second (mostly read, only some write) when this happens.

Back to CTRL+ALT+F3 - last time I was able to do it, it showed:

10:32:12 up 1 day, 11:57, 1 user, load average: 0.58, 2.62, 2.04
Tasks: 233 total, 2 running, 175 sleeping, 0 stopped, 0 zombie
Mem: 4032940 total, 282352 free, 3334396 used, 416192 buff/cache
Swap: 4177916 total, 2128120 free, 2049796 used, 779396 avail Mem
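The memory and swap figures from top can be turned into percentages, which makes memory pressure easier to spot. A minimal sketch, assuming the standard MemTotal, MemAvailable, SwapTotal and SwapFree fields of /proc/meminfo; mem_pressure is a hypothetical helper name. With the numbers posted here (779396 of 4032940 available, 2049796 of 4177916 swap used) it works out to roughly 19.3% of memory available and 49.1% of swap in use.

```bash
#!/usr/bin/env bash
# mem_pressure (hypothetical name): print the share of RAM still available
# and the share of swap in use, read from /proc/meminfo or a file given as $1.
mem_pressure() {
    LC_ALL=C awk '
        $1 == "MemTotal:"     { mem_total = $2 }
        $1 == "MemAvailable:" { mem_avail = $2 }
        $1 == "SwapTotal:"    { swap_total = $2 }
        $1 == "SwapFree:"     { swap_free = $2 }
        END {
            if (mem_total > 0)
                printf "mem_available_pct=%.1f\n", 100 * mem_avail / mem_total
            if (swap_total > 0)
                printf "swap_used_pct=%.1f\n", 100 * (swap_total - swap_free) / swap_total
        }
    ' "${1:-/proc/meminfo}"
}

# Example: mem_pressure
```

Run from cron next to the other logging, this would show whether the freezes line up with swap filling, which is one possible explanation for heavy disk activity on a 4 GB machine.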

Typically, I have the following things open/running in the system: Firefox, latest edition (anywhere from 3 to 15 tabs open) - mostly text based websites without animation or scripts running; Evolution; GKRELLM; Gnome System Monitor; a terminal window; several LibreOffice documents; Telegram - a chat program (note: this has been going on long before the install of Telegram); one to three windows of Eye of Gnome (picture viewer) and on a semi regular basis these items may be open: Tor and xmms.

System froze again here, while writing this. Only locked up for four minutes this time. Fortunately, I have not had to hard boot it in a bit.

Obviously, having the computer inaccessible for large portions of time while I need to be using it is not acceptable. To me, it seems as if the system is writing a log of everything that has happened since the last time it went through the process, but I have no proof of that. Is this possible? If so, can I turn it off so the computer can be used without interruption?

Comments

Your system shows all the signs of an I/O problem due to OOM or extensive I/O usage. As your swap is in active use, an OOM is possible. I/O problems can also be caused if a million small files are written to the drive. The best example is the "Runes of Magic" update process, which lags out an SSD and leads to the same problems you describe.

Solution: we need to identify it with iotop. iotop can be run inside a cron job and it can write to a logfile. Set one up, please. Make it "iotop --options >>logfile;sleep 10;iotop --options >>logfile;sleep 10;iotop --options >>logfile;sleep 10;" and so on, or make a small bash script.

rdtcustomercare ( 2019-03-31 13:51:07 -0600 )
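The cron-plus-logfile idea can also be sketched with iotop's own batch options instead of a chain of sleep calls. This is a minimal sketch, not a tested setup: log_iotop and the IOTOP_BIN override are hypothetical names, and the log path in the example is an assumption. iotop normally needs root.

```bash
#!/usr/bin/env bash
# log_iotop (hypothetical name): append about a minute of iotop samples to a log.
# IOTOP_BIN can point at another binary; /usr/sbin/iotop is the path that
# appears in the cron entries posted in this thread.
log_iotop() {
    local logfile="$1"
    local iotop_bin="${IOTOP_BIN:-/usr/sbin/iotop}"
    # -b batch mode, -o only tasks doing I/O, -t timestamp on each line,
    # -q print column names only once, -d 10 sample every 10 seconds,
    # -n 6 stop after six samples
    "$iotop_bin" -b -o -t -q -d 10 -n 6 >> "$logfile" 2>&1
}

# Example (as root, e.g. from a once-a-minute cron job):
#   log_iotop /var/log/iotop.log
```

Appending with >> keeps earlier samples, and the -t timestamps make it possible to match log lines against the time of a freeze.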

Thank you rdtcustomercare. I will work on IOTOP first few free moments I have. Will have to research it as I have never used it before. As soon as I know something, I will post here. I really appreciate the suggestion and everyone's help.

m.drahcir ( 2019-03-31 18:17:58 -0600 )

answered 2019-03-25 18:22:49 -0600

m.drahcir

Thank you. Will do the CTRL+ALT+F3 when I have a few minutes to go through the information it returns and post here. I do plan to upgrade as quickly as possible. Work has been non-stop and an upgrade usually takes me a week or so to get everything back the way I need it.

Also, thank you- I have actually pulled all memory, wiped down contacts with CRC226 and reinstalled. Pulled processors, cleaned them up, new Arctic Silver and back in they went... about a month ago. Right now Core 0 is running at 105.8F and Core 1 is 93.2F. When the processors peg out they can get up to high 120s, but rarely over 130.
