
ch_huang

AI Compiler Engineer

Step into ByteIR

Draft notes for a ByteIR code review. Frontend: ByteIR supports three frontend inputs (ONNX, TF, and Torch), all of which eventually converge to the stablehlo dialect. frontends/ ├── README.md ├──

C++17 new feature

1. Structured binding: binds the specified names to subobjects or elements of the initializer. Like a reference, a structured binding is an alias to an existing object. Unlike a reference, a structured binding does not have to be of a reference type. In other words, like a reference it can bind to the members of an existing structured object, but not necessarily

How does xla integrate with triton

Brief: XLA's codegen can optionally use the Triton backend to generate code for a small number of ops, including matmul and softmax. Via the option xla_gpu_

Inductor code review

compile_fx: registering inductor with dynamo. The main entry point of inductor is the compile_fx function, which dynamo registers as follows: @register_backend def inductor(*args, **kwargs): # do import here to avoid loading inductor into memory when it is not used from torch._inductor.compile_fx import compile_fx return compile_fx(*args, **kwargs) Core logic of compile_fx. Function declaration: def compile_fx( model_: torch.fx.GraphModule, example_inputs_: List[torch.Tensor], inner_compile: Callable[..., Any] = compile_fx_inner, config_patches: Optional[Dict[str, Any]]

Analysis triton tutorial matmul L2 cache optimization

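Refer to the Triton tutorial. The Triton compiler handles the thread layout and memory layout inside a CTA; what happens outside a CTA (that is, how the CTAs themselves are ordered) is left to the user to tune. This Triton tutorial shows how to improve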
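As a rough illustration of what tuning the CTA ordering means, the sketch below reproduces in plain Python the grouped program-id mapping used in the Triton matmul tutorial and compares it with a row-major order. It assumes the post discusses that grouped ordering (the excerpt is cut off before saying so). The helper `strips_touched` is a hypothetical metric introduced here: it counts the distinct row strips of A and column strips of B that a batch of programs needs.

```python
def row_major_pid(pid, num_pid_m, num_pid_n):
    """Map a linear program id to an output tile (pid_m, pid_n), row by row."""
    return pid // num_pid_n, pid % num_pid_n


def grouped_pid(pid, num_pid_m, num_pid_n, group_size_m):
    """Map a linear program id to an output tile, walking groups of rows.

    Programs are ordered column by column inside a band of `group_size_m`
    tile rows before moving to the next band.
    """
    num_pid_in_group = group_size_m * num_pid_n
    group_id = pid // num_pid_in_group
    first_pid_m = group_id * group_size_m
    # The last band may hold fewer rows when num_pid_m is not a multiple.
    rows = min(num_pid_m - first_pid_m, group_size_m)
    pid_in_group = pid % num_pid_in_group
    return first_pid_m + pid_in_group % rows, pid_in_group // rows


def strips_touched(tiles):
    """Count distinct A row strips plus B column strips needed by the tiles."""
    rows = {m for m, _ in tiles}
    cols = {n for _, n in tiles}
    return len(rows) + len(cols)
```

On a 9 x 9 tile grid with a group size of 3, the first nine programs in row-major order need 10 strips (one row of A and nine columns of B), while the grouped order needs only 6 (three rows and three columns). Fewer distinct strips means more reuse from the L2 cache.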