Binghui Ouyang in PyTorch
How We Used AWS Inferentia to Boost PyTorch NLP Model Performance by 4.9x
AWS Inferentia is the first ML chip by AWS, which promises to achieve the highest throughput at almost half the cost per inference
Apr 7, 2021