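Overview
sc-core is the communication component of sc and provides [...] capabilities. It is a gRPC server, and every component that needs to establish a channel is a gRPC client.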
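Design goals
Problems
- Deployment: the firewall policy was opened in the wrong direction. The previous rule, central controller -> engine, is changed to engine -> central controller. For the reasoning, see the explanation of the communication link mechanisms.
- Link sprawl: component owners used to develop independently, without a system-level view, which led to too many links, chaotic communication, and extra complexity.
Solutions
- Using sc-core as the common communication component resolves the firewall policy problem: connections are now established in the engine -> central controller direction.
- Clients connect to sc-core to establish a communication channel, which consolidates multiple links into one and reduces system complexity.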
Architecture and implementation
Main principles
- gRPC bidirectional communication (channel establishment)
- TLS (security authentication)
- gRPC streamInterceptor

Implementation
Base library
Protocol
core_message
core_message is the protocol sc-core uses to transmit messages.
    message Metadata {
      message Info {
        protobuf.common.v1.Engine engine = 1;
        protobuf.common.v1.Node node = 2;
        string component = 3;
      }
      Info src = 1;
      Info dst = 2;
    }

    message CoreMessage {
      uint64 messageId = 1;

      Metadata metadata = 2;
      oneof message {
        Request request = 4;
        Response response = 5;
      }
      int64 expiredTime = 6;

      string version = 7;

      map<string, string> headers = 9;

      uint64 Seq = 10;
    }
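Because a CoreMessage carries a messageId together with either a Request or a Response, a client presumably matches each reply to its request by that ID. The following is a minimal sketch of such matching, not sc-core's actual code: the Pending type, its methods, and ErrRequestTimeout are hypothetical names introduced for illustration, and the reply body is reduced to a byte slice.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"time"
)

// ErrRequestTimeout is a hypothetical error returned when no reply arrives in time.
var ErrRequestTimeout = errors.New("request timeout")

// Pending tracks in-flight requests by message ID.
// It is an illustrative helper, not a type taken from sc-core.
type Pending struct {
	mu      sync.Mutex
	nextID  uint64
	waiters map[uint64]chan []byte
}

func NewPending() *Pending {
	return &Pending{waiters: make(map[uint64]chan []byte)}
}

// Register allocates a message ID and the channel that will receive its reply.
func (p *Pending) Register() (uint64, <-chan []byte) {
	p.mu.Lock()
	defer p.mu.Unlock()
	p.nextID++
	ch := make(chan []byte, 1) // buffered so Resolve never blocks
	p.waiters[p.nextID] = ch
	return p.nextID, ch
}

// Resolve hands a reply body to the waiter for id. It reports false when
// nobody is waiting, for example because the request already timed out.
func (p *Pending) Resolve(id uint64, body []byte) bool {
	p.mu.Lock()
	ch, ok := p.waiters[id]
	delete(p.waiters, id)
	p.mu.Unlock()
	if ok {
		ch <- body
	}
	return ok
}

// Wait blocks until the reply for id arrives or the timeout elapses.
func (p *Pending) Wait(id uint64, ch <-chan []byte, timeout time.Duration) ([]byte, error) {
	timer := time.NewTimer(timeout)
	defer timer.Stop()
	select {
	case body := <-ch:
		return body, nil
	case <-timer.C:
		p.mu.Lock()
		delete(p.waiters, id) // forget the request so a late reply is dropped
		p.mu.Unlock()
		return nil, ErrRequestTimeout
	}
}

func main() {
	p := NewPending()
	id, ch := p.Register()
	go p.Resolve(id, []byte("ok")) // stands in for the receive loop seeing a Response
	body, err := p.Wait(id, ch, time.Second)
	fmt.Println(string(body), err) // prints "ok <nil>"
}
```

In this sketch a timeout is returned as an ordinary error, so the caller can decide whether to retry or give up.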
Request
    enum Method {
      GET = 0;
      POST = 1;
      PUT = 2;
      DELETE = 3;
      PATCH = 4;
      LIST = 5;
    }

    message Request {
      string url = 1;
      string method = 2;
      bytes body = 3;
      map<string, string> params = 4;
    }
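Request.method is declared as a string even though a Method enum exists, so a receiver may want to validate the string against the enum names before dispatching. A minimal sketch under that assumption follows; ParseMethod is a hypothetical helper, and accepting lower-case input is a choice made here rather than documented sc-core behavior.

```go
package main

import (
	"fmt"
	"strings"
)

// Method mirrors the numeric values of the Method enum in the protocol.
type Method int32

const (
	GET Method = iota
	POST
	PUT
	DELETE
	PATCH
	LIST
)

var methodByName = map[string]Method{
	"GET": GET, "POST": POST, "PUT": PUT,
	"DELETE": DELETE, "PATCH": PATCH, "LIST": LIST,
}

// ParseMethod maps the string carried in Request.method onto the enum.
// Case-insensitive matching is an assumption of this sketch.
func ParseMethod(s string) (Method, error) {
	m, ok := methodByName[strings.ToUpper(strings.TrimSpace(s))]
	if !ok {
		return 0, fmt.Errorf("unknown method %q", s)
	}
	return m, nil
}

func main() {
	m, err := ParseMethod("list")
	fmt.Println(m, err) // prints "5 <nil>"
}
```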
Response
    message Response {
      enum Code {
        SUCCESS_CODE = 0;
        FAIL_CODE = 1;
      }
      Code code = 1;
      bytes body = 2;
      string error = 3;
    }
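On the calling side, a Response is usually easier to handle once it has been turned into a Go (body, error) pair. The sketch below shows one way to do that. The Response struct is a hand-written stand-in for the generated protobuf type, and Unwrap is a hypothetical helper; treating every code other than SUCCESS_CODE as a failure is an assumption.

```go
package main

import (
	"errors"
	"fmt"
)

// Code mirrors Response.Code.
type Code int32

const (
	SuccessCode Code = 0 // SUCCESS_CODE
	FailCode    Code = 1 // FAIL_CODE
)

// Response is a hand-written stand-in for the generated protobuf type.
type Response struct {
	Code  Code
	Body  []byte
	Error string
}

// Unwrap returns the body on success and a Go error otherwise.
func Unwrap(r Response) ([]byte, error) {
	if r.Code == SuccessCode {
		return r.Body, nil
	}
	if r.Error == "" {
		return nil, errors.New("request failed without an error message")
	}
	return nil, errors.New(r.Error)
}

func main() {
	body, err := Unwrap(Response{Code: SuccessCode, Body: []byte("ok")})
	fmt.Println(string(body), err) // prints "ok <nil>"
}
```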
service stream
    service CoreService {
      rpc Agent (stream protobuf.common.v1.CoreMessage) returns (stream protobuf.common.v1.CoreMessage) {}
    }
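Since every client opens a single Agent stream and each message names its src and dst, sc-core presumably keeps a registry of connected peers and forwards each message to the stream registered for its dst. The sketch below illustrates that idea with plain channels in place of gRPC streams. Registry, Info, Message, and ErrPeerNotFound are hypothetical names, Info flattens the Engine and Node messages into strings, and the peer names in main are made up.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// ErrPeerNotFound stands in for the "503, peer not found" reply.
var ErrPeerNotFound = errors.New("503: peer not found")

// Info is a simplified stand-in for Metadata.Info.
type Info struct {
	Engine, Node, Component string
}

// Message is a simplified stand-in for CoreMessage.
type Message struct {
	Src, Dst Info
	Body     []byte
}

// Registry maps each registered peer to its outbound queue.
// The queue takes the place of the peer's gRPC stream in this sketch.
type Registry struct {
	mu    sync.RWMutex
	peers map[Info]chan Message
}

func NewRegistry() *Registry {
	return &Registry{peers: make(map[Info]chan Message)}
}

// Register records a peer and returns the queue it should read from.
func (r *Registry) Register(who Info, buffer int) <-chan Message {
	ch := make(chan Message, buffer)
	r.mu.Lock()
	r.peers[who] = ch
	r.mu.Unlock()
	return ch
}

// Unregister removes a peer, for example when its stream ends.
func (r *Registry) Unregister(who Info) {
	r.mu.Lock()
	delete(r.peers, who)
	r.mu.Unlock()
}

// Forward routes a message by its destination.
func (r *Registry) Forward(m Message) error {
	r.mu.RLock()
	ch, ok := r.peers[m.Dst]
	r.mu.RUnlock()
	if !ok {
		return ErrPeerNotFound
	}
	select {
	case ch <- m:
		return nil
	default:
		return errors.New("peer queue full")
	}
}

func main() {
	r := NewRegistry()
	agent := Info{Engine: "engine-1", Node: "node-1", Component: "agent"} // hypothetical names
	inbox := r.Register(agent, 8)
	fmt.Println(r.Forward(Message{Dst: agent, Body: []byte("hello")}))
	fmt.Println(string((<-inbox).Body))
	fmt.Println(r.Forward(Message{Dst: Info{Component: "missing"}}))
}
```

A lookup miss is reported as an error instead of being dropped silently, which matches the symptom described under the hosts issue: a request arrives, but no registration exists for its peer.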
Server
Testing
Container tests
Container test 1
    docker build -t core .
    docker run -d -p 20111:20111 --name core core
Container test 2
Installation on Ubuntu 20.04
    docker build -f test/install/Dockerfile -t ubuntu-sc-core .
    docker run -d -p 20111:20111 --privileged --name ubuntu-sc-core \
      -v /sys/fs/cgroup:/sys/fs/cgroup:ro \
      -v /run/dbus/system_bus_socket:/run/dbus/system_bus_socket:ro \
      -v /run/dbus:/run/dbus \
      --cap-add SYS_ADMIN ubuntu-sc-core \
      && docker exec ubuntu-sc-core systemctl start install.service
Container test 3
    docker build -f test/e2e/Dockerfile -t core-client .
    docker run --rm --net=host -e ADDR=127.0.0.1 -e LOOP=500 core-client
    docker run --rm -e ADDR=172.16.1.216 -e LOOP=500 core-client
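Running tests with the Makefile
Troubleshooting
hosts issue
The symptom is that sc-core cannot find the client's registration info but still receives requests from that client, so it replies to the client with 503 (peer not found). Running ss showed that the client was connected to the IP of a different sc instance.
This is easy to run into in test environments and unlikely in customer environments. If it happens, check /etc/hosts right away, or clean it up and retry.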
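To make the /etc/hosts check repeatable, the lookup can be scripted. The sketch below parses hosts-file content and lists every IP mapped to a given name. lookupHosts is a hypothetical helper, and the host name sc-core in the sample content is an assumption about how the entry might be named.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// lookupHosts returns every IP that the given hosts-file content maps to name.
func lookupHosts(content, name string) []string {
	var ips []string
	sc := bufio.NewScanner(strings.NewReader(content))
	for sc.Scan() {
		line := sc.Text()
		if i := strings.IndexByte(line, '#'); i >= 0 {
			line = line[:i] // strip comments
		}
		fields := strings.Fields(line)
		if len(fields) < 2 {
			continue // blank line or an address without names
		}
		for _, host := range fields[1:] {
			if strings.EqualFold(host, name) {
				ips = append(ips, fields[0])
				break
			}
		}
	}
	return ips
}

func main() {
	// Sample content; on a real machine, read /etc/hosts with os.ReadFile.
	content := "127.0.0.1 localhost\n172.16.1.216 sc-core # stale test entry\n"
	fmt.Println(lookupHosts(content, "sc-core")) // prints "[172.16.1.216]"
}
```

More than one returned IP, or an IP that does not belong to the intended sc instance, points at a stale entry.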
Peer issues
Peer exited
    2024/05/06 11:03:08 core client connection stop
    2024/05/06 11:03:08 client handle recv ctx done
    2024/05/06 11:03:08 rpc err: rpc error: code = Canceled desc = grpc: the client connection is closing
    2024/05/06 11:03:08 core client handle receive error rpc error: code = Canceled desc = grpc: the client connection is closing
    2024/05/06 11:03:13 request: request timeout
    panic: request timeout
Seen while debugging
    2024/05/06 15:59:57 server is currently unavailable: rpc error: code = Unavailable desc = error reading from server: EOF
    2024/05/06 15:59:57 rpc err: rpc error: code = Unavailable desc = error reading from server: EOF
    2024/05/06 15:59:57 core client handle receive error rpc error: code = Unavailable desc = error reading from server: EOF
    2024/05/06 15:59:57 start client server err: rpc error: code = Unavailable desc = error reading from server: EOF
    2024/05/06 15:59:57 core client connection close error: rpc error: code = Canceled desc = grpc: the client connection is closing
    2024/05/06 15:59:57 core client connection stop
    2024/05/06 15:59:58 server is currently unavailable: rpc error: code = Unavailable desc = error reading from server: EOF
    2024/05/06 15:59:58 rpc err: rpc error: code = Unavailable desc = error reading from server: EOF
    2024/05/06 15:59:58 request: connect interrupted
    2024/05/06 15:59:58 core client handle receive error rpc error: code = Unavailable desc = error reading from server: EOF
    2024/05/06 15:59:58 start err: rpc error: code = Unavailable desc = error reading from server: EOF
    panic: start client server err: rpc error: code = Unavailable desc = error reading from server: EOF

    2024/05/06 16:27:29 server is currently unavailable: rpc error: code = Unavailable desc = error reading from server: EOF
    2024/05/06 16:27:30 rpc err: rpc error: code = Unavailable desc = error reading from server: EOF
    2024/05/06 16:27:30 server is currently unavailable: rpc error: code = Unavailable desc = error reading from server: EOF
    2024/05/06 16:27:30 rpc err: rpc error: code = Unavailable desc = error reading from server: EOF
    2024/05/06 16:27:30 request: message send fail:[EOF]
    2024/05/06 16:27:30 core client handle receive error rpc error: code = Unavailable desc = error reading from server: EOF
    2024/05/06 16:27:30 start err: rpc error: code = Unavailable desc = error reading from server: EOF
    2024/05/06 16:27:30 server is currently unavailable: rpc error: code = Unavailable desc = error reading from server: EOF
    2024/05/06 16:27:30 core client handle receive error rpc error: code = Unavailable desc = error reading from server: EOF
    2024/05/06 16:27:30 start client server err: rpc error: code = Unavailable desc = error reading from server: EOF
    2024/05/06 16:27:30 rpc error: code = Unavailable desc = error reading from server: EOF
    2024/05/06 16:27:30 server is currently unavailable: rpc error: code = Unavailable desc = error reading from server: EOF
    2024/05/06 16:27:33 message send fail:[EOF]
    2024/05/06 16:27:35 request chan delete user/info
    2024/05/06 16:27:35 request: message send fail:[EOF]
    2024/05/06 16:27:37 connect err: context deadline exceeded
    2024/05/06 16:27:37 message send fail:[EOF]
    panic: context deadline exceeded

    2024/05/06 16:23:32 server is currently unavailable: rpc error: code = Unavailable desc = error reading from server: EOF
    2024/05/06 16:23:32 rpc err: rpc error: code = Unavailable desc = error reading from server: EOF
    2024/05/06 16:23:32 core client handle receive error rpc error: code = Unavailable desc = error reading from server: EOF
    2024/05/06 16:23:32 start err: rpc error: code = Unavailable desc = error reading from server: EOF
    2024/05/06 16:23:32 server is currently unavailable: rpc error: code = Unavailable desc = error reading from server: EOF
    2024/05/06 16:23:32 server is currently unavailable: rpc error: code = Unavailable desc = error reading from server: EOF
    2024/05/06 16:23:32 rpc err: rpc error: code = Unavailable desc = error reading from server: EOF
    2024/05/06 16:23:32 core client handle receive error rpc error: code = Unavailable desc = error reading from server: EOF
    2024/05/06 16:23:32 start err: rpc error: code = Unavailable desc = error reading from server: EOF
    2024/05/06 16:23:33 server is currently unavailable: rpc error: code = Unavailable desc = error reading from server: EOF
    panic: rpc error: code = Unavailable desc = error reading from server: EOF
Unknown issue
    rpc error: code = Unavailable desc = error reading from server: EOF
    dlv --listen=:2345 --headless=true --api-version=2 --accept-multiclient exec ./client
    dlv --listen=:2345 --headless=true --api-version=2 --accept-multiclient exec ./client -- -case base.simple -addr 127.0.0.1 -loop 100
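    # Debug build: disable optimizations and inlining so the debugger can map code to source
    go build -gcflags "all=-N -l" -o client cmd/client/main.go
    # Debug run: arguments after "--" are passed to ./client, not to dlv
    dlv --listen=:2345 --headless=true --api-version=2 --accept-multiclient exec ./client -- -case base.simple -addr 127.0.0.1 -loop 100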