Commit 752a83d · verified · 1 parent: 1842c56
Committed by yuanzu

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -11,7 +11,7 @@ The INT8 data type is both friendly and efficient for most hardware platforms.

  In benchmarking, we observe **no accuracy loss** and up to **33\%** performance enhancement.

- [SGLang](https://github.com/sgl-project/sglang/tree/main) will soon support the block-wise INT8 quantization operation once our [PULL REQUEST](https://github.com/sgl-project/sglang/pull/3730) is merged.
+ Thanks to our merged [PULL REQUEST](https://github.com/sgl-project/sglang/pull/3730), [SGLang](https://github.com/sgl-project/sglang/tree/main) now supports the block-wise INT8 quantization operation.

  ## 1. Benchmarking Result (detailed in [PULL REQUEST](https://github.com/sgl-project/sglang/pull/3730)):
  | Model | Config | Accuracy (GSM8K) | Accuracy (MMLU) | Output Throughput(qps=128) |