Continual Learning (CL) is the problem of learning predictive models sequentially from changing data that may originate from different contexts. Many existing CL methods assume that the data stream is divided into a sequence of contexts, termed tasks, with explicitly given transition boundaries. Unfortunately, many real-world CL scenarios provide neither explicit task information nor context boundaries, which motivates the study of task-agnostic CL. This paper proposes a variational architecture-growing framework dubbed VariGrow. By interpreting dynamically growing neural networks as a Bayesian approximation and defining flexible implicit variational distributions, VariGrow detects whether a new task is arriving through an energy-based novelty score. If the novelty score is high and the sample is “detected” as belonging to a new task, VariGrow grows a new expert module to be responsible for it. Otherwise, the sample is assigned to the existing expert that is most “familiar” with it (i.e., the one with the lowest novelty score), so that the knowledge already acquired is preserved. We evaluate VariGrow on several CIFAR- and ImageNet-based benchmarks in the strictly task-agnostic CL setting, with no task information during training or testing, where it achieves consistently superior or competitive performance.
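
The routing rule described above (grow a new expert when every existing expert finds the sample novel, otherwise assign it to the most familiar expert) can be illustrated with a minimal sketch. This is not the VariGrow implementation: the abstract does not specify the form of the energy function or how the threshold is set, so the sketch assumes a common energy-based score, the negative log-sum-exp of an expert's logits, and a fixed hypothetical `threshold`. The names `energy_score`, `route`, and `make_toy_expert` are illustrative, and the toy expert stands in for a trained network.

```python
import math
from typing import Callable, List, Sequence, Tuple

Logits = Sequence[float]
Expert = Callable[[Sequence[float]], Logits]


def energy_score(logits: Logits) -> float:
    """Assumed novelty score: negative log-sum-exp of the logits.

    Lower energy means the expert is more "familiar" with the input.
    """
    m = max(logits)  # subtract the max for numerical stability
    return -(m + math.log(sum(math.exp(z - m) for z in logits)))


def make_toy_expert(center: Sequence[float]) -> Expert:
    """Hypothetical stand-in for a trained expert.

    Its logits shrink with squared distance from `center`, so inputs far
    from the data it "saw" receive a high energy (novelty) score.
    """
    anchor = list(center)

    def expert(x: Sequence[float]) -> Logits:
        d2 = sum((a - b) ** 2 for a, b in zip(x, anchor))
        return [-d2, -d2]

    return expert


def route(
    x: Sequence[float],
    experts: List[Expert],
    make_expert: Callable[[Sequence[float]], Expert],
    threshold: float,
) -> Tuple[int, bool]:
    """Return (expert index, whether a new expert was grown) for sample x."""
    scores = [energy_score(expert(x)) for expert in experts]
    if scores and min(scores) <= threshold:
        # Some expert is familiar enough: pick the lowest novelty score.
        return scores.index(min(scores)), False
    # Every expert finds x novel (or none exists yet): grow a new one.
    experts.append(make_expert(x))
    return len(experts) - 1, True
```

In this toy setting, a stream of samples clustered around two distant points yields two experts, with later samples from either cluster routed to the expert grown for it. Training the newly grown expert and choosing the threshold are omitted here.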