Compare commits

25 Commits

Author SHA1 Message Date
lxb 7f6c5a1dd8 Update gitignore 3 weeks ago
lxb cec4b4028a Fix bug: keep re-raising SSH exceptions, which resolves the failure to reconnect 8 months ago
lxb be437429f3 Merge branch 'master' of http://git.lxblxb.top/lxb/Tool_CheckGPUsWeb 9 months ago
lxb 05a85a09d8 Add option to hide GPUs 9 months ago
lxbhahaha 718b26cfca Sort usage by size and filter out entries below 50MB 9 months ago
lxbhahaha c6570b007b Add per-user usage 9 months ago
lxbhahaha d1509a5609 Update readme 9 months ago
lxbhahaha 33c96824d4 Update readme 9 months ago
lxbhahaha 4d483a2c8c Change the network speed display 10 months ago
lxbhahaha ca9682a4af Fix: remove the last divider; refresh immediately after a checkbox is clicked 10 months ago
lxbhahaha 07d0422095 Add toggles for the display options 10 months ago
lxbhahaha 9c3127595e Update the template 10 months ago
lxbhahaha 7280ca69d3 Add network speed view 10 months ago
lxbhahaha c116a81c2c update 10 months ago
lxbhahaha a5631fe369 Change the implementation approach 10 months ago
鱼骨剪 4b940a89af Adjust the implementation approach 10 months ago
鱼骨剪 7014aa37d2 Adjust how the title is displayed 10 months ago
鱼骨剪 fad4dce56a Add memory display 10 months ago
鱼骨剪 ddb945fd5d Improve the storage space display 10 months ago
鱼骨剪 a0f28e1a84 Initial storage space display 10 months ago
鱼骨剪 f0904e3893 Show the timestamp of the current data; more detailed error messages 11 months ago
鱼骨剪 3e4ee65cc0 update 12 months ago
鱼骨剪 f1d627f718 Add color display 12 months ago
鱼骨剪 62aad057cb Initial implementation of server data display 12 months ago
鱼骨剪 8b26843851 Initial implementation of fetching server GPU data 12 months ago
  1. 5
      .gitignore
  2. 94
      README.md
  3. 289
      app.py
  4. 305
      index.html
  5. BIN
      pics/demo.png
  6. 21
      serverList_examlpe.json

5
.gitignore

@ -0,0 +1,5 @@
**/__pycache__/
__pycache__/
.vscode/
serverList.json

94
README.md

@ -0,0 +1,94 @@
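# 1. Introduction
View information (network, memory, disk, GPU) for multiple servers at once on a web page.
Roughly, the Python backend connects to each server over SSH, periodically runs commands and parses their terminal output into a dictionary, and the front-end page periodically fetches that dictionary and visualizes it.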
![](pics/demo.png)
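The polling idea described above can be sketched as follows. This is a minimal illustration, not the project's code: `fake_collect`, `poll_forever` and `snapshot` are hypothetical names, and the SSH commands are replaced by a stand-in that returns made-up numbers.

```python
import threading
import time

data_dict = {}                # shared state, keyed by server title
data_lock = threading.Lock()  # guards every read and write of data_dict


def fake_collect(title):
    """Stand-in for the SSH commands; returns made-up numbers."""
    return {"title": title, "memory_used_kb": 1024, "collected_at": time.time()}


def poll_forever(title, interval, stop_event):
    """Refresh one server's entry every `interval` seconds until stopped."""
    while not stop_event.is_set():
        sample = fake_collect(title)
        with data_lock:
            data_dict[title] = sample
        stop_event.wait(interval)


def snapshot():
    """What an HTTP handler could return: a copy taken under the lock."""
    with data_lock:
        return {title: dict(entry) for title, entry in data_dict.items()}


if __name__ == "__main__":
    stop = threading.Event()
    worker = threading.Thread(
        target=poll_forever, args=("SERVER_76", 0.1, stop), daemon=True
    )
    worker.start()
    time.sleep(0.3)
    stop.set()
    worker.join()
    print(snapshot())
```

One thread per server keeps a slow or unreachable machine from delaying the others, and the reader only ever sees a complete entry because it copies under the same lock.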
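# 2. Installation
## 2.1. Runtime environment
This is the environment needed to run the backend. It can be installed in a conda virtual environment; both Linux and Windows work.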
```bash
pip install flask flask-cors paramiko -i https://pypi.tuna.tsinghua.edu.cn/simple
```
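## 2.2. Server environment
This is the environment installed on the servers to be monitored.
Since the tool simply connects to each server over SSH and runs commands to collect the information, some of those commands may not ship with the server's OS and must be installed separately; otherwise the corresponding data cannot be collected.
- **ifstat**: a tool for collecting network data, installable via apt (not needed if you do not display network data). You also need to run the command once on the server to see which network interface is the main one, and write its name into the config file (this can be left out if you do not need network information).
- **gpustat**: used to get per-user usage on each GPU; also installable via apt.
- **NVIDIA driver**: the NVIDIA driver must be installed so that GPU information can be obtained via `nvidia-smi` (AMD cards are probably not supported).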
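To give a feel for what parsing command output involves, here is a small sketch for `nvidia-smi`. It assumes the variant `nvidia-smi --query-gpu=index,name,memory.total,memory.used --format=csv,noheader,nounits`, which prints one comma-separated row per GPU without a header or units. The `SAMPLE` text is made up for illustration and `parse_gpu_rows` is a hypothetical helper.

```python
import csv
import io

# Made-up sample of the assumed output format (one row per GPU).
SAMPLE = (
    "0, NVIDIA GeForce RTX 3090, 24576, 1200\n"
    "1, NVIDIA GeForce RTX 3090, 24576, 23000\n"
)


def parse_gpu_rows(text):
    """Turn CSV rows of index, name, total MiB, used MiB into dictionaries."""
    rows = csv.reader(io.StringIO(text), skipinitialspace=True)
    gpus = []
    for row in rows:
        if len(row) != 4:
            continue  # skip blank or unexpected lines
        index, name, total, used = row
        gpus.append({
            "idx": int(index),
            "gpu_name": name,
            "total_mem": int(total),
            "used_mem": int(used),
        })
    return gpus


if __name__ == "__main__":
    for gpu in parse_gpu_rows(SAMPLE):
        print(gpu)
```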
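To find the right network interface with ifstat: after installing it via apt, run `ifstat` in a terminal to see output like the following (Ctrl+C to stop). There is usually more than one interface, and the names differ between machines. Look for the interface whose numbers change; in the output below it is `eno2`, which can then be written into the config file.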
```
eno1 eno2 br-6c8650526aef docker0 veth1d3300f
KB/s in KB/s out KB/s in KB/s out KB/s in KB/s out KB/s in KB/s out KB/s in KB/s out
0.00 0.00 3.31 1.96 0.00 0.00 0.00 0.00 0.00 0.00
0.00 0.00 2.23 1.52 0.00 0.00 0.00 0.00 0.00 0.00
0.00 0.00 7.56 8.03 0.00 0.00 0.00 0.00 0.00 0.00
0.00 0.00 4.00 4.55 0.00 0.00 0.00 0.00 0.00 0.00
0.00 0.00 3.66 0.19 0.00 0.00 0.00 0.00 0.00 0.00
0.00 0.00 8.34 8.26 0.00 0.00 0.00 0.00 0.00 0.00
0.00 0.00 8.25 4.78 0.00 0.00 0.00 0.00 0.00 0.00
```
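Picking the interface by eye can also be sketched in code. The example below assumes the layout shown above: the first line holds the interface names, the second the column headers, and every following row has one in/out pair per interface. `total_traffic` and `busiest_interface` are hypothetical helpers, not part of the project.

```python
# A few rows copied from the ifstat output above.
IFSTAT_SAMPLE = """\
eno1 eno2 br-6c8650526aef docker0 veth1d3300f
KB/s in KB/s out KB/s in KB/s out KB/s in KB/s out KB/s in KB/s out KB/s in KB/s out
0.00 0.00 3.31 1.96 0.00 0.00 0.00 0.00 0.00 0.00
0.00 0.00 2.23 1.52 0.00 0.00 0.00 0.00 0.00 0.00
0.00 0.00 7.56 8.03 0.00 0.00 0.00 0.00 0.00 0.00
"""


def total_traffic(text):
    """Sum in + out traffic per interface over all data rows."""
    lines = [line for line in text.splitlines() if line.strip()]
    names = lines[0].split()
    totals = {name: 0.0 for name in names}
    for row in lines[2:]:  # skip the names line and the header line
        try:
            values = [float(v) for v in row.split()]
        except ValueError:
            continue  # skip rows that contain non-numeric fields
        if len(values) != 2 * len(names):
            continue
        for i, name in enumerate(names):
            totals[name] += values[2 * i] + values[2 * i + 1]
    return totals


def busiest_interface(text):
    """Return the interface name with the largest total traffic."""
    totals = total_traffic(text)
    return max(totals, key=totals.get)


if __name__ == "__main__":
    print(busiest_interface(IFSTAT_SAMPLE))
```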
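## 2.3. Backend deployment
Once the runtime environment is installed and the config file is set up, open a screen session and run `python app.py` in the project directory. (Make sure this machine can reach the servers to be monitored.)
Note that near the end of app.py there is the line `app.run(debug=True, host='127.0.0.1', port=port)`. You can change debug to `False` and host to `0.0.0.0` (this seems to be required when deploying on a cloud server), and change port as needed. The corresponding port must also be opened in the firewall. You can also change the `check_interval` variable (default 2), which is the interval in seconds between successive checks of a server.
The config file is named `serverList.json` by default and must be created by yourself. For the format, see `serverList_example.json` (committed in this repository as `serverList_examlpe.json`). The fields are:
- title: server name, used for display.
- ip: server IP address, used to connect.
- port: port to connect to, usually 22; adjust as needed, for example when accessing a container.
- username: account name used to log in.
- password: account password used to log in.
- key_filename: **path** to the key file used to log in. (Only one of password and key_filename needs to be set; if the server only allows key-based login, provide the key.)
- network_interface_name: network interface name. (Optional; not needed if you do not want to visualize network speed.)
- storage_list: list of paths whose storage usage should be checked. (Optional; usage of the root directory is always checked whether or not this is set.)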
```json
{
"title": "SERVER_76",
"ip": "123.123.123.76",
"port": 22,
"username": "lxb",
"password": "abcdefg",
"key_filename": "/home/.ssh/id_rsa",
"network_interface_name": "eno2",
"storage_list": [
"/media/D",
"/media/F"
]
}
```
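A config entry like the one above can be checked before the backend starts. The sketch below follows the field rules listed earlier; `check_entry` and `REQUIRED_KEYS` are hypothetical names, and treating `title`, `ip`, `port` and `username` as required is an assumption based on app.py indexing them directly.

```python
# Keys assumed to be mandatory for every server entry.
REQUIRED_KEYS = ("title", "ip", "port", "username")


def check_entry(entry):
    """Return a list of problems found in one server entry (empty if none)."""
    problems = [f"missing '{key}'" for key in REQUIRED_KEYS if key not in entry]
    # The README asks for one of the two login methods.
    if "password" not in entry and "key_filename" not in entry:
        problems.append("need 'password' or 'key_filename'")
    if not isinstance(entry.get("storage_list", []), list):
        problems.append("'storage_list' must be a list")
    return problems


if __name__ == "__main__":
    entry = {
        "title": "SERVER_76",
        "ip": "123.123.123.76",
        "port": 22,
        "username": "lxb",
        "key_filename": "/home/.ssh/id_rsa",
        "storage_list": ["/media/D", "/media/F"],
    }
    print(check_entry(entry))
```

Running such a check on every entry loaded from `serverList.json` would surface a missing field at startup instead of as a connection error in a worker thread.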
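After startup, if `serverList.json` is modified, app.py must be restarted for the change to take effect.
## 2.4. Web page deployment
A simple way to deploy the page is to run an nginx container with Docker.
First install Docker, then run `docker run -d -p 80:80 -v /home/lxb/nginx_gpus:/usr/share/nginx/html --name nginx_gpus nginx:latest`. Note that the command **should be adjusted as needed**; the adjustable parts are shown below.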
```bash
docker run -d \
-p <port to map on the host>:80 \
-v <data volume location on the host>:/usr/share/nginx/html \
--name <container name> \
nginx:latest
```
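**You also need to** replace the address inside the fetchData function in index.html with the backend's IP and port. (`fetch('<replace this>/all_data')`)
Then put `index.html` into the data volume, replacing the original file. Then visit `ip:mapped port` of the host, e.g. `123.123.123.123:80` (the port can be omitted if it is the default 80), to open the page.
You can also change the setInterval time, i.e. how often the backend is queried. It should not be shorter than the backend's check_interval; otherwise many requests will just return data that has not been updated.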
```javascript
// Fetch data when the page loads, then refresh periodically
document.addEventListener('DOMContentLoaded', function() {
    fetchData();
    setInterval(fetchData, 3000); // refresh the data every 3 seconds
});
```
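If you have a domain name you can also set up a reverse proxy; see [Deploying a web page with Nginx on a server + reverse proxy](http://blog.lxblxb.top/archives/1723257245091).
# 3. Other
- `永辉` helped sort out the layout of the checkboxes at the top.
- Per-user usage for each GPU was added following `治鹏`'s approach.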

289
app.py

@ -1,7 +1,9 @@
 from flask import Flask, jsonify
+from datetime import datetime
 from flask_cors import CORS
 import threading
 import paramiko
+import json
 import time
 #region Globals
@ -9,6 +11,11 @@ import time
 app = Flask(__name__)
 CORS(app)
 port = 15002
+server_list_path = 'serverList.json'
+data_list_lock = threading.Lock()
+check_interval = 2
+# Shared data, keyed by server title
+data_dict = dict()
 #endregion
@ -19,10 +26,9 @@ port = 15002
 def hello():
     return 'hi. —— CheckGPUsWeb'
-@app.route('/data', methods=['GET'])
+@app.route('/all_data', methods=['GET'])
 def get_data():
-    data = {'name': 'John', 'age': 25, 'city': 'New York'}
-    return jsonify(data)
+    return jsonify(get_all_data())
 # Start connecting to the servers
 def connect_server():
@ -30,9 +36,284 @@ def connect_server():
 #endregion
+def get_gpus_info(client, timeout, info_list:list=None, ignore_gpu=False):
+    if ignore_gpu:
+        return None
+    try:
+        cmd = 'nvidia-smi --query-gpu=index,name,memory.total,memory.used,memory.free,utilization.gpu,utilization.memory,temperature.gpu --format=csv'
+        stdin, stdout, stderr = client.exec_command(cmd, timeout=timeout)
+        output = stdout.read().decode()
+        output = output.split('\n')
+        start_idx = 0
+        for i in range(len(output)):
+            if output[i] == 'index, name, memory.total [MiB], memory.used [MiB], memory.free [MiB], utilization.gpu [%], utilization.memory [%], temperature.gpu':
+                start_idx = i + 1
+                break
+        output = output[start_idx:-1]
+        # Parse the data -----------------------------
+        result = []
+        for data in output:
+            data_list = data.split(', ')
+            idx = int(data_list[0])
+            gpu_name = data_list[1]
+            total_mem = int(data_list[2].split(' ')[0])
+            used_mem = int(data_list[3].split(' ')[0])
+            free_mem = int(data_list[4].split(' ')[0])
+            util_gpu = int(data_list[5].split(' ')[0])
+            util_mem = int(data_list[6].split(' ')[0])
+            temperature = int(data_list[7])
+            # Shorten the GPU name
+            if gpu_name.startswith('NVIDIA '):
+                gpu_name = gpu_name[7:]
+            if gpu_name.startswith('GeForce '):
+                gpu_name = gpu_name[8:]
+            result.append({
+                'idx': idx,
+                'gpu_name': gpu_name,
+                'total_mem': total_mem,
+                'used_mem': used_mem,
+                'free_mem': free_mem,
+                'util_gpu': util_gpu,
+                'util_mem': util_mem,
+                'temperature': temperature,
+                'users': {}
+            })
+        # Read per-user usage
+        try:
+            gpustat_cmd = 'gpustat --json'
+            stdin, stdout, stderr = client.exec_command(gpustat_cmd, timeout=timeout)
+            gpustat_output = stdout.read().decode()
+            # Make sure the gpustat output is not empty
+            if not gpustat_output:
+                raise ValueError("gpustat did not return any output.")
+            gpustat_info = json.loads(gpustat_output)
+            # Make sure the parsed gpustat info has the expected shape
+            if 'gpus' not in gpustat_info:
+                raise ValueError("Parsed gpustat info does not contain 'gpus' key.")
+            # Parse the process info -----------------------------
+            for gpu in gpustat_info['gpus']:
+                idx = gpu['index']
+                processes = gpu.get('processes', [])  # use get() to avoid a KeyError
+                for process in processes:
+                    username = process['username']
+                    gpu_memory_usage = process['gpu_memory_usage']  # GPU memory in use
+                    # Find the matching GPU and record the user and their GPU memory usage
+                    for gpu_result in result:
+                        if gpu_result['idx'] == idx:
+                            if username not in gpu_result['users']:
+                                gpu_result['users'][username] = 0
+                            gpu_result['users'][username] += gpu_memory_usage
+        except Exception as e:
+            if info_list is not None:
+                info_list.append(f'gpu user: {e}')
+        return result
+    except paramiko.ssh_exception.SSHException as e:
+        # SSH exceptions are still re-raised
+        raise
+    except Exception as e:
+        if info_list is not None:
+            info_list.append(f'gpus: {e}')
+        return None
+def get_storage_info(client, timeout, path_list, info_list:list=None):
+    try:
+        result = []
+        for target_path in path_list:
+            stdin, stdout, stderr = client.exec_command(f'df {target_path} | grep \'{target_path}\'', timeout=timeout)
+            output = stdout.read().decode()
+            if output == "":
+                continue
+            data = output.split()
+            tmp_res = {
+                "path": target_path,
+                "total": int(data[1]),
+                "available": int(data[3])
+            }
+            result.append(tmp_res)
+        return result
+    except paramiko.ssh_exception.SSHException as e:
+        # SSH exceptions are still re-raised
+        raise
+    except Exception as e:
+        if info_list is not None:
+            info_list.append(f'storage: {e}')
+        return None
+def get_memory_info(client, timeout, info_list:list=None):
+    try:
+        stdin, stdout, stderr = client.exec_command('free', timeout=timeout)
+        output = stdout.read().decode().split('\n')[1]
+        if output == "":
+            return None
+        data = output.split()
+        result = {
+            "total": int(data[1]),
+            "used": int(data[2])
+        }
+        return result
+    except paramiko.ssh_exception.SSHException as e:
+        # SSH exceptions are still re-raised
+        raise
+    except Exception as e:
+        if info_list is not None:
+            info_list.append(f'memory: {e}')
+        return None
+def get_network_info(client, timeout, interface_name, info_list:list=None):
+    try:
+        if interface_name is None:
+            return None
+        stdin, stdout, stderr = client.exec_command(f'ifstat -i {interface_name} 0.1 1', timeout=timeout)
+        output = stdout.read().decode().split('\n')[2]
+        data = output.split()
+        result = {
+            "in": float(data[0]),
+            "out": float(data[1])
+        }
+        return result
+    except paramiko.ssh_exception.SSHException as e:
+        # SSH exceptions are still re-raised
+        raise
+    except Exception as e:
+        if info_list is not None:
+            info_list.append(f'network: {e}')
+        return None
+# Continuously collect information from one server
+def keep_check_one(server: dict, shared_data_list: dict, server_title: str, interval: float, re_connect_time: float=5):
+    # Prepare the storage paths to check
+    if 'storage_list' not in server:
+        server['storage_list'] = []
+    if '/' not in server['storage_list']:
+        server['storage_list'].insert(0, '/')
+    re_try_count = 0
+    # Connection loop
+    while True:
+        try:
+            # Open the SSH connection
+            client = paramiko.SSHClient()
+            client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
+            client.connect(server['ip'], port=server['port'], username=server['username'], password=server.get('password', None), key_filename=server.get('key_filename', None), timeout=interval*3)
+            shared_data_list[server_title]['err_info'] = None
+            re_try_count = 0
+            # Polling loop
+            keep_run = True
+            while keep_run:
+                try:
+                    error_info_list = []
+                    # GPU info
+                    gpu_info = get_gpus_info(client, interval*3, info_list=error_info_list, ignore_gpu=server.get('ignore_gpu', False))
+                    # Storage info
+                    storage_info = get_storage_info(client, interval*3, server['storage_list'], info_list=error_info_list)
+                    # Memory info
+                    memory_info = get_memory_info(client, interval*3, info_list=error_info_list)
+                    # Network info
+                    network_info = get_network_info(client, interval*3, server.get('network_interface_name', None), info_list=error_info_list)
+                    # Record the results
+                    with data_list_lock:
+                        shared_data_list[server_title]['gpu_info_list'] = gpu_info
+                        shared_data_list[server_title]['storage_info_list'] = storage_info
+                        shared_data_list[server_title]['memory_info'] = memory_info
+                        shared_data_list[server_title]['network_info'] = network_info
+                        shared_data_list[server_title]['updated'] = True
+                        shared_data_list[server_title]['maxGPU'] = len(gpu_info) if gpu_info is not None else 0
+                        if len(error_info_list) > 0:
+                            shared_data_list[server_title]['err_info'] = '\n'.join(error_info_list)
+                except Exception as e:
+                    keep_run = False
+                    shared_data_list[server_title]['err_info'] = f'{e}'
+                    if 'gpu_info_list' in shared_data_list[server_title]:
+                        shared_data_list[server_title].pop('gpu_info_list')
+                time.sleep(interval)
+            # Close the connection
+            client.close()
+        except Exception as e:
+            shared_data_list[server_title]['err_info'] = f'retry:{re_try_count}, {e}'
+            time.sleep(re_connect_time)
+            re_try_count += 1
+# Get the data for all servers
+def get_all_data():
+    return filter_data(list(data_dict.keys()))
+# Filter the server data by key
+def filter_data(title_list: list):
+    result = dict()
+    server_data = dict()
+    for title in title_list:
+        server_data[title] = {}
+        # No data exists for this title
+        if title not in data_dict:
+            server_data[title]['err_info'] = f'title \'{title}\' not exist!'
+            continue
+        # Record the data ----------------------------------------------------
+        data_updated = data_dict[title].get('updated', False)
+        # Whether the data has been updated
+        server_data[title]['updated'] = data_updated
+        # Error info
+        err_info = data_dict[title].get('err_info', None)
+        if err_info is not None:
+            server_data[title]['err_info'] = err_info
+        # GPUs
+        gpu_info_list = data_dict[title].get('gpu_info_list', None)
+        if gpu_info_list is not None:
+            server_data[title]['gpu_info_list'] = gpu_info_list
+        # Disks
+        storage_info_list = data_dict[title].get('storage_info_list', None)
+        if storage_info_list is not None:
+            server_data[title]['storage_info_list'] = storage_info_list
+        # Memory
+        memory_info = data_dict[title].get('memory_info', None)
+        if memory_info is not None:
+            server_data[title]['memory_info'] = memory_info
+        # Network
+        network_info = data_dict[title].get('network_info', None)
+        if network_info is not None:
+            server_data[title]['network_info'] = network_info
+    result['time'] = datetime.now().strftime('%Y-%m-%d %H:%M:%S')
+    result['server_data'] = server_data
+    return result
+def start_connect():
+    # Load the JSON config
+    with open(server_list_path, 'r') as f:
+        server_list = json.load(f)
+    global data_dict
+    # Start one thread per server
+    for i, server_data in enumerate(server_list):
+        data_dict[server_data['title']] = {}
+        data_dict[server_data['title']]['server_data'] = server_data
+        thread = threading.Thread(target=keep_check_one, args=(server_data, data_dict, server_data['title'], check_interval))
+        thread.daemon = True
+        thread.start()
+    print('start connect')
 # Test
 def test():
-    app.run(debug=True, port=port)
+    start_connect()
+    app.run(debug=True, host='127.0.0.1', port=port)
 if __name__ == '__main__':
     test()

305
index.html

@ -3,26 +3,315 @@
 <head>
     <meta charset="UTF-8">
     <meta name="viewport" content="width=device-width, initial-scale=1.0">
-    <title>Fetch JSON from Flask</title>
+    <title>Server and GPU Information</title>
+    <style>
+        .card {
+            margin: 10px;
+            padding: 10px;
+            border: 1px solid #ccc;
+            border-radius: 5px;
+            width: 300px;
+            display: inline-block;
+            vertical-align: top;
+        }
+        .server-name {
+            font-weight: bold;
+            margin-bottom: 5px;
+            font-size: 24px; /* font size */
+            background-color: black; /* black background */
+            color: white; /* white text */
+            padding: 10px; /* extra padding for a nicer look */
+            border-radius: 5px; /* optional: rounded corners */
+        }
+        .gpu-info {
+            margin-top: 10px;
+            border: 1px solid #ccc; /* border */
+            border-radius: 8px; /* rounded corners */
+            padding: 10px; /* padding */
+            margin-bottom: 15px; /* bottom margin */
+            background-color: #f9f9f9; /* background color */
+            box-shadow: 0 2px 5px rgba(0, 0, 0, 0.1); /* shadow */
+        }
+        .user-info {
+            margin-top: 10px; /* top margin */
+            font-size: 14px; /* font size */
+            color: #555; /* font color */
+        }
+        .user-item {
+            color: #007bff; /* username color */
+            font-weight: bold; /* bold */
+        }
+        /* Header styles */
+        .head_contrainer{
+            display: flex;
+            flex-direction: row;
+            justify-content: space-between;
+            height: 90px;
+            align-items: center;
+        }
+        .head_contrainer .checkboxes{
+            display: flex;
+            align-items: center;
+        }
+    </style>
 </head>
 <body>
-    <h1>Fetch JSON from Flask Example</h1>
-    <button onclick="fetchData()">Fetch Data</button>
-    <div id="output"></div>
+    <div class="head_contrainer">
+        <div>
+            <h1>Server and GPU Information</h1>
+            <p id="time"></p>
+        </div>
+        <div class="checkboxes">
+            <div class="sample">
+                <label for="toggle_network">Network</label>
+                <input type="checkbox" id="toggle_network" checked onchange="updateDisplay()">
+            </div>
+            <div class="sample">
+                <label for="toggle_memory">Memory</label>
+                <input type="checkbox" id="toggle_memory" checked onchange="updateDisplay()">
+            </div>
+            <div class="sample">
+                <label for="toggle_storage">Storage</label>
+                <input type="checkbox" id="toggle_storage" checked onchange="updateDisplay()">
+            </div>
+            <div class="sample">
+                <label for="toggle_gpus">GPU</label>
+                <input type="checkbox" id="toggle_gpus" checked onchange="updateDisplay()">
+            </div>
+        </div>
+    </div>
+    <div id="server-data"></div>
     <script>
+        let lastData = null;
+        // Request GPU data from the server
         function fetchData() {
-            fetch('http://lxblxb.top:15002/data') // Send a GET request to the '/get_data' path on the Flask server
+            fetch('http://127.0.0.1:15002/all_data')
+                // Get server and GPU data
                 .then(response => response.json()) // Parse the JSON response
                 .then(data => {
                     // Handle the JSON data
-                    console.log(data);
-                    document.getElementById('output').innerHTML = '<pre>' + JSON.stringify(data, null, 2) + '</pre>';
+                    // console.log(data);
+                    displayServerData(data); // Call the function that displays the data
                 })
                 .catch(error => {
-                    console.error('Error fetching data:', error);
+                    // console.error('Error fetching data:', error);
+                    displayError(error + " (most likely the backend could not be reached: it may not be running, or there is a network error)");
                 });
         }
+        function displayError(err_info){
+            let serverDataContainer = document.getElementById('server-data');
+            serverDataContainer.innerHTML = ''; // Clear the container
+            let errDiv = document.createElement('div');
+            errDiv.classList.add('error-info');
+            errDiv.innerHTML = err_info;
+            serverDataContainer.appendChild(errDiv);
+        }
+        function parse_data_unit(num, fixedLen=2){
+            if (num < 1024){
+                return num.toFixed(fixedLen) + " KB";
+            }
+            num /= 1024;
+            if (num < 1024){
+                return num.toFixed(fixedLen) + " MB";
+            }
+            num /= 1024;
+            if (num < 1024){
+                return num.toFixed(fixedLen) + " GB";
+            }
+            num /= 1024;
+            if (num < 1024){
+                return num.toFixed(fixedLen) + " TB";
+            }
+        }
+        function add_bar(serverCard){
+            let bar = document.createElement('hr');
+            serverCard.appendChild(bar);
+        }
+        function updateDisplay(){
+            if (lastData != null){
+                displayServerData(lastData);
+            }
+        }
+        // Bind the data to the page
+        function displayServerData(data) {
+            lastData = data;
+            // Render -------------------
+            let serverDataContainer = document.getElementById('server-data');
+            serverDataContainer.innerHTML = ''; // Clear the container
+            let timeStr = data['time']
+            let serverData = data['server_data']
+            let timeDiv = document.getElementById('time')
+            timeDiv.textContent = "Updated at: " + timeStr
+            let greenDot = '<span style="color: green;"> idle</span>';
+            let yellowDot = '<span style="color: orange;"> busy</span>';
+            let redDot = '<span style="color: red;"> busy</span>';
+            for (let key in serverData){
+                let serverCard = document.createElement('div');
+                serverCard.classList.add('card');
+                // Title
+                let serverName = document.createElement('div');
+                serverName.classList.add('server-name');
+                let updateFlag = serverData[key].updated ? '' : ' - Not updated -';
+                serverName.textContent = key + updateFlag;
+                serverCard.appendChild(serverName);
+                // Network speed
+                if (document.getElementById('toggle_network').checked && 'network_info' in serverData[key]){
+                    let networkInfo = document.createElement('div');
+                    networkInfo.classList.add('network-info');
+                    let inNum = serverData[key].network_info.in;
+                    let outNum = serverData[key].network_info.out;
+                    inNum = parse_data_unit(inNum)
+                    outNum = parse_data_unit(outNum)
+                    networkInfo.innerHTML += "<strong> Network : </strong> in: " + inNum + "/s, out: " + outNum + "/s";
+                    serverCard.appendChild(networkInfo);
+                    // Divider
+                    add_bar(serverCard);
+                }
+                // Memory
+                if (document.getElementById('toggle_memory').checked && 'memory_info' in serverData[key]){
+                    let memoryInfo = document.createElement('div');
+                    memoryInfo.classList.add('memory-info');
+                    let totalNum = serverData[key].memory_info.total
+                    let usedNum = serverData[key].memory_info.used
+                    let totalMem = parse_data_unit(totalNum);
+                    let usedMem = parse_data_unit(usedNum);
+                    let tmpColor = "green";
+                    if (usedNum / totalNum > 0.8)
+                        tmpColor = "red";
+                    else if (usedNum / totalNum > 0.6)
+                        tmpColor = "orange";
+                    memoryInfo.innerHTML += "<strong> Memory : </strong> <span style=\"color: " + tmpColor + ";\">" + usedMem + " / " + totalMem + "</span><br>";
+                    serverCard.appendChild(memoryInfo);
+                    // Divider
+                    add_bar(serverCard);
+                }
+                // Storage space
+                if (document.getElementById('toggle_storage').checked && 'storage_info_list' in serverData[key]){
+                    let storageInfo = document.createElement('div');
+                    storageInfo.classList.add('storage-info');
+                    for (let i = 0; i < serverData[key].storage_info_list.length; i++) {
+                        let targetPath = serverData[key].storage_info_list[i].path;
+                        let totalNum = serverData[key].storage_info_list[i].total
+                        let availableNum = serverData[key].storage_info_list[i].available
+                        let totalStorage = parse_data_unit(totalNum);
+                        let availableStorage = parse_data_unit(totalNum - availableNum);
+                        let tmpColor = "green";
+                        if (availableNum / totalNum < 0.1)
+                            tmpColor = "red";
+                        else if (availableNum / totalNum < 0.3)
+                            tmpColor = "orange";
+                        storageInfo.innerHTML += '<strong>' + targetPath + " :</strong> <span style=\"color: " + tmpColor
+                            + ";\">" + availableStorage + " / " + totalStorage + "</span><br>";
+                    }
+                    serverCard.appendChild(storageInfo);
+                    // Divider
+                    add_bar(serverCard);
+                }
+                // GPU
+                if (document.getElementById('toggle_gpus').checked && 'gpu_info_list' in serverData[key]){
+                    serverData[key].gpu_info_list.forEach(function(gpu){
+                        let gpuInfo = document.createElement('div');
+                        gpuInfo.classList.add('gpu-info');
+                        let colorDot = greenDot;
+                        if (gpu.used_mem < 1000 && gpu.util_gpu < 20){
+                            colorDot = greenDot;
+                        }
+                        else if (gpu.util_mem < 50){
+                            colorDot = yellowDot;
+                        }else{
+                            colorDot = redDot;
+                        }
+                        gpuInfo.innerHTML = '<strong>' + gpu.idx + ' - ' + gpu.gpu_name + colorDot + '</strong><br>'
+                            + 'Temperature: ' + gpu.temperature + '°C<br>'
+                            + 'GPU memory: ' + gpu.used_mem + ' / ' + gpu.total_mem + " MB" + '<br>'
+                            + 'Utilization: ' + gpu.util_gpu + '%';
+                        // Add per-user usage info
+                        if ('users' in gpu) { // Check whether user info is present
+                            let userInfo = document.createElement('div');
+                            userInfo.classList.add('user-info');
+                            userInfo.innerHTML = "<strong>Usage: </strong>";
+                            // for (const [username, mem] of Object.entries(gpu.users)) {
+                            //     userInfo.innerHTML += `<span class="user-item">${username} (${mem}) </span>`;
+                            // }
+                            // Sort by usage, largest first
+                            const user_entries = Object.entries(gpu.users);
+                            const sorted_user = user_entries.sort((a, b) => b[1] - a[1]);
+                            sorted_user.forEach(([key, value]) => {
+                                // Filter out small entries (40 MB or less)
+                                if (value > 40)
+                                    userInfo.innerHTML += `<span class="user-item">${key} (${value}) </span>`;
+                            });
+                            gpuInfo.appendChild(userInfo); // Append the user info to the GPU info
+                        }
+                        serverCard.appendChild(gpuInfo);
+                    });
+                    // Divider
+                    add_bar(serverCard);
+                }
+                // Error info
+                if ('err_info' in serverData[key])
+                {
+                    let errInfo = document.createElement('div');
+                    errInfo.classList.add('error-info');
+                    errInfo.innerHTML = '<strong>error info</strong><br>' + serverData[key].err_info;
+                    serverCard.appendChild(errInfo);
+                    // Divider
+                    add_bar(serverCard);
+                }
+                // Remove the trailing divider
+                if (serverCard.lastElementChild && serverCard.lastElementChild.tagName === 'HR') {
+                    serverCard.removeChild(serverCard.lastElementChild);
+                }
+                serverDataContainer.appendChild(serverCard);
+            }
+        }
+        // Fetch data when the page loads, then refresh periodically
+        document.addEventListener('DOMContentLoaded', function() {
+            fetchData();
+            setInterval(fetchData, 3000); // refresh the data every 3 seconds
+        });
     </script>
 </body>
 </html>

BIN
pics/demo.png

Binary file not shown.

Size: 558 KiB (before and after)

21
serverList_examlpe.json

@ -0,0 +1,21 @@
[
{
"title": "SERVER_233",
"ip": "123.123.123.233",
"port": 12345,
"username": "lxb",
"password": "abcdefg"
},
{
"title": "SERVER_76",
"ip": "123.123.123.76",
"port": 22,
"username": "lxb",
"key_filename": "/home/.ssh/id_rsa",
"network_interface_name": "eno2",
"storage_list": [
"/media/D",
"/media/F"
]
}
]