The pursuit of artificial general intelligence (AGI) continuously demands higher computing performance. Integrated photonic circuits offer superior processing speed and energy efficiency, but their capacity and scalability are limited by unavoidable errors, so that only simple tasks and shallow models have been realized. To support modern AGI, we designed Taichi: large-scale photonic chiplets built on an integrated diffractive-interference hybrid design and a general distributed computing architecture, with a capacity of millions of neurons and an energy efficiency of 160 tera-operations per second per watt (TOPS/W). Taichi experimentally achieved on-chip 1000-category-level classification (91.89% test accuracy on the 1623-category Omniglot dataset) and high-fidelity artificial intelligence-generated content, with up to two orders of magnitude of improvement in efficiency. Taichi paves the way for large-scale photonic computing and advanced tasks, further exploiting the flexibility and potential of photonics for modern AGI.