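Accelerated Linear Algebra (XLA)
Developed from TensorFlow

    import tensorflow as tf

    def calc(x, y, z):
        # Multiply, add, then reduce: three operations in the compute graph
        return tf.reduce_sum(x + y * z)

Optimises the compute graph: the multiply, add, and reduction can be fused into a single kernel launch instead of launching three separate kernels.
See also: PJRT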
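The difference between one fused kernel and three separate kernels in calc can be sketched in plain Python. This is a conceptual illustration only, not XLA's implementation or API: lists stand in for tensors, each loop stands in for one kernel launch, and the names calc_unfused and calc_fused are hypothetical.

```python
# Conceptual illustration only: plain Python lists stand in for tensors,
# and each pass over the data stands in for one kernel launch.
# calc_unfused and calc_fused are hypothetical names, not part of any library.

def calc_unfused(x, y, z):
    """Three passes, mirroring three separate kernel launches."""
    prod = [b * c for b, c in zip(y, z)]       # pass 1: multiply, writes a temporary
    total = [a + p for a, p in zip(x, prod)]   # pass 2: add, writes another temporary
    return sum(total)                          # pass 3: reduce


def calc_fused(x, y, z):
    """One pass, mirroring a single fused kernel with no temporaries."""
    acc = 0.0
    for a, b, c in zip(x, y, z):
        acc += a + b * c  # multiply, add, and accumulate per element
    return acc


x, y, z = [1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]
assert calc_unfused(x, y, z) == calc_fused(x, y, z)
```

The fused version reads each input once and never materialises the intermediate results, which is where the saving comes from. In TensorFlow itself, XLA compilation of a function such as calc is typically requested by decorating it with tf.function(jit_compile=True).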