crash "watchdog"



  • Good evening, I have a recurring crash on my miner, of the "WATCHDOG" type.
    The log says:

    === Last 50 lines of /var/log/miner/claymore/lastrun_reboot.log ===
    16:07:04:627 f2ca9700 buf: {"jsonrpc":"2.0","id":0,"result":["0xfb20efb0aef42731a0b9f60763ac05c0c6bdb43e5eb3d8a1a4292a1e29875568","0x4e2977c9152afafb8ea63ea5434ada0692b481b8ad37d05391c3a7738d63eb5d","0x000000006df37f675ef6eadf5ab9a2072d44268d97df837e6748956e5c6c2116"]}
    
    16:07:04:627 f2ca9700 parse packet: 242
    16:07:04:627 f2ca9700 ETH: job is the same
    16:07:04:627 f2ca9700 new buf size: 0
    16:07:07:369 fe721700 GPU 0 temp = 71, old fan speed = 22, new fan speed = 26
    
    16:07:07:370 fe721700 GPU 1 temp = 47, old fan speed = 33, new fan speed = 25
    
    16:07:10:370 fe721700 GPU 0 temp = 71, old fan speed = 23, new fan speed = 27
    
    16:07:10:370 fe721700 GPU 1 temp = 47, old fan speed = 33, new fan speed = 25
    
    16:07:12:875 f2ca9700 got 243 bytes
    16:07:12:875 f2ca9700 buf: {"jsonrpc":"2.0","id":0,"result":["0x88342ad3d8e70776b3a0d50bdc90ecf56a11c1fd3ddb930823aaa8767a52292b","0x4e2977c9152afafb8ea63ea5434ada0692b481b8ad37d05391c3a7738d63eb5d","0x000000006df37f675ef6eadf5ab9a2072d44268d97df837e6748956e5c6c2116"]}
    
    16:07:12:875 f2ca9700 parse packet: 242
    16:07:12:875 f2ca9700 ETH: job changed
    16:07:12:875 f2ca9700 new buf size: 0
    16:07:12:875 f2ca9700 ETH: 03/04/18-16:07:12 - New job from eth-eu2.nanopool.org:9999
    16:07:12:875 f2ca9700 target: 0x000000006df37f67 (diff: 10000MH), epoch 173(2.35GB)
    16:07:12:875 f2ca9700 ETH - Total Speed: 11.706 Mh/s, Total Shares: 300, Rejected: 0, Time: 21:31
    16:07:12:875 f2ca9700 ETH: GPU0 0.000 Mh/s, GPU1 11.706 Mh/s
    16:07:13:371 fe721700 GPU 0 temp = 72, old fan speed = 24, new fan speed = 29
    
    16:07:13:371 fe721700 GPU 1 temp = 47, old fan speed = 33, new fan speed = 25
    
    16:07:13:371 fe721700 GPU0 t=72C fan=26%, GPU1 t=47C fan=33%
    16:07:14:568 f2ca9700 ETH: checking pool connection...
    16:07:14:568 f2ca9700 send: {"worker": "", "jsonrpc": "2.0", "params": [], "id": 3, "method": "eth_getWork"}
    
    16:07:14:630 f2ca9700 got 243 bytes
    16:07:14:630 f2ca9700 buf: {"jsonrpc":"2.0","id":0,"result":["0x88342ad3d8e70776b3a0d50bdc90ecf56a11c1fd3ddb930823aaa8767a52292b","0x4e2977c9152afafb8ea63ea5434ada0692b481b8ad37d05391c3a7738d63eb5d","0x000000006df37f675ef6eadf5ab9a2072d44268d97df837e6748956e5c6c2116"]}
    
    16:07:14:630 f2ca9700 parse packet: 242
    16:07:14:630 f2ca9700 ETH: job is the same
    16:07:14:630 f2ca9700 new buf size: 0
    16:07:16:371 fe721700 GPU 0 temp = 72, old fan speed = 26, new fan speed = 31
    
    16:07:16:371 fe721700 GPU 1 temp = 47, old fan speed = 33, new fan speed = 25
    
    16:07:17:710 f0ca5700 srv_thr cnt: 1, IP: 127.0.0.1
    16:07:17:710 f0ca5700 recv: 51
    16:07:17:710 f0ca5700 srv pck: 50
    16:07:17:710 f0ca5700 srv bs: 0
    16:07:17:710 f0ca5700 sent: 159
    16:07:19:327 f2ca9700 send: {"id":6,"jsonrpc":"2.0","method":"eth_submitHashrate","params":["0xb2a3c8", "0x0000000000000000000000000000000000000000000000000000000024932c48"]}
    
    16:07:19:372 fe721700 GPU 0 temp = 72, old fan speed = 28, new fan speed = 33
    
    16:07:19:372 fe721700 GPU 1 temp = 47, old fan speed = 33, new fan speed = 25
    
    16:07:19:835 fef22700 em hbt: 1, fm hbt: 48,
    16:07:19:836 fef22700 watchdog - thread 0 (gpu0), hb time 69374
    16:07:19:836 fef22700 WATCHDOG: GPU 0 hangs in OpenCL call, exit
    16:07:19:836 fef22700 watchdog - thread 1 (gpu0), hb time 69196
    16:07:19:836 fef22700 WATCHDOG: GPU 0 hangs in OpenCL call, exit
    16:07:19:836 fef22700 watchdog - thread 2 (gpu1), hb time 44
    16:07:19:836 fef22700 watchdog - thread 3 (gpu1), hb time 201
    16:07:19:836 fef22700 Rebooting
    

    The crash occurs about twice every 24 hours, repeating 4 to 5 times in a row each time.
    The miner then runs for many hours without any trouble.

    The GPU in question has its original BIOS (I believe) and runs at stock clocks.

    The amdcovc command returns:

    Memory Clocks: 300 1750                                                                                                
    Adapter 1: Hawaii PRO [Radeon R9 290/390]                                                                                
    Core: 1040 MHz, Mem: 1500 MHz, CoreOD: 0, MemOD: 0, Load: 100%, Temp: 57 C, Fan: 33.7255%                              
    Core clocks: 300 500 698 858 899 935 969 1040                                                                          
    Memory Clocks: 150 1500 
    

    The GPU never exceeds 71°C and I have not set any mining limit.
    Since the rig runs with only 2 GB of RAM, I checked memory usage while mining.
    The "free -m" command returns:

                   total        used        free      shared  buff/cache   available                                        
     Mem:           1933         534        1111           7         288        1237                                         
     Swap:             0           0           0
    

    534 MB used, so the RAM does not seem to be the cause.
    I plugged in the 4-pin Molex connector located on the motherboard next to the RAM, and the system now crashes at every boot.
    I don't think disabling the watchdog would solve the problem; it would only keep the miner running with the GPU in question switched off.
    If anyone has an idea, thanks in advance.
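
    For anyone who wants to check which threads the watchdog flagged without reading the whole file, here is a minimal Python sketch that picks out the "watchdog - thread" lines from the log quoted above. The function name `stalled_threads` and the threshold of 10000 are illustrative assumptions, not Claymore's actual limit, and the unit of "hb time" is not stated in the log.

```python
import re

# Matches lines such as:
#   16:07:19:836 fef22700 watchdog - thread 0 (gpu0), hb time 69374
WATCHDOG_RE = re.compile(
    r"watchdog - thread (?P<thread>\d+) \(gpu(?P<gpu>\d+)\), hb time (?P<hb>\d+)"
)


def stalled_threads(log_text, threshold=10_000):
    """Return (thread, gpu, hb_time) tuples whose hb time exceeds threshold.

    The threshold is an arbitrary illustrative value (an assumption),
    not the limit the miner itself uses.
    """
    stalled = []
    for match in WATCHDOG_RE.finditer(log_text):
        hb = int(match.group("hb"))
        if hb > threshold:
            stalled.append((int(match.group("thread")), int(match.group("gpu")), hb))
    return stalled


# Lines copied from the end of the log in the post.
sample = """\
16:07:19:836 fef22700 watchdog - thread 0 (gpu0), hb time 69374
16:07:19:836 fef22700 WATCHDOG: GPU 0 hangs in OpenCL call, exit
16:07:19:836 fef22700 watchdog - thread 1 (gpu0), hb time 69196
16:07:19:836 fef22700 WATCHDOG: GPU 0 hangs in OpenCL call, exit
16:07:19:836 fef22700 watchdog - thread 2 (gpu1), hb time 44
16:07:19:836 fef22700 watchdog - thread 3 (gpu1), hb time 201
"""

for thread, gpu, hb in stalled_threads(sample):
    print(f"thread {thread} on gpu{gpu}: hb time {hb}")
```

    On the excerpt above, only the two gpu0 threads are reported, which matches the "GPU 0 hangs in OpenCL call" messages; both gpu1 threads have small hb times.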



  • Help 🙃 🙃

