stubpy.ast_pass
AST pre-pass — harvests structural metadata from source without executing the module.
This module runs a read-only walk over the source file’s AST before (or instead of) importing the module. Because no code is executed, this pass is free from import-time side effects.
The harvested data is stored in ASTSymbols and fed into
build_symbol_table() to construct the
SymbolTable.
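The idea of a read-only pass can be illustrated with the standard library alone. The following is a minimal sketch, not stubpy's implementation: top_level_names and SOURCE are hypothetical names, and the point is only that ast.parse builds a tree without running the module body.

```python
import ast

# Hypothetical input: importing this module would raise, but parsing it is safe.
SOURCE = '''
raise RuntimeError("import-time side effect")

class Foo:
    pass

def bar():
    pass
'''


def top_level_names(source: str) -> list[str]:
    """Return the names of top-level classes and functions, in source order."""
    tree = ast.parse(source)  # builds the syntax tree only; nothing is executed
    names = []
    for node in tree.body:
        if isinstance(node, (ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef)):
            names.append(node.name)
    return names


print(top_level_names(SOURCE))
```

The raise statement at the top of SOURCE is never executed, so the walk completes even though importing the same text would fail.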
What is harvested
Classes: name, source line, base class expressions (as strings), decorator names, and directly-defined methods.
Module-level functions: name, line, async flag, decorator names, and a flag for @overload-decorated variants.
Annotated variables: assignments of the form name: Type = value at module scope.
__all__: the explicit public API list, when present.
Type alias declarations (all forms):
    Name: TypeAlias = <rhs> (explicit PEP 613 annotation)
    Name = int | float (bare PEP 604 union)
    Name = Union[str, int] (subscripted generic)
    Name = int (known built-in or typing type name)
    type Name = <rhs> (Python 3.12+ PEP 695 soft keyword)
    type Stack[T] = list[T] (generic alias, PEP 695)
TypeVar / ParamSpec / TypeVarTuple / NewType call-expression declarations.
Ignore directive
If the source file begins (before any code) with a comment containing
# stubpy: ignore (case-insensitive), the harvester returns an empty
ASTSymbols and the caller should skip stub generation for that
file. Check ASTSymbols.skip_file to detect this.
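One way such a check could work is sketched below. This is an assumption-laden illustration rather than the library's code: has_ignore_directive and the regular expression are hypothetical, and the scan is line-based, stopping at the first line that is neither blank nor a comment.

```python
import re

# Hypothetical pattern: matches "# stubpy: ignore" in any letter case.
_IGNORE_RE = re.compile(r"#\s*stubpy:\s*ignore", re.IGNORECASE)


def has_ignore_directive(source: str) -> bool:
    """Return True if an ignore comment appears before the first line of code."""
    for line in source.splitlines():
        stripped = line.strip()
        if not stripped:
            continue  # blank lines are allowed before the directive
        if not stripped.startswith("#"):
            return False  # first code line reached without seeing the directive
        if _IGNORE_RE.search(stripped):
            return True
    return False


print(has_ignore_directive("# STUBPY: IGNORE\nx = 1\n"))
print(has_ignore_directive("x = 1\n# stubpy: ignore\n"))
```

A caller would typically run a check like this before any parsing and skip the file when it returns True.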
What is not harvested
Nested functions or classes inside other functions.
Import statements (handled by stubpy.imports).
Runtime values, which require the module to be executed.
Examples
>>> from stubpy.ast_pass import ast_harvest
>>> syms = ast_harvest("x: int = 1\nclass Foo: pass\n")
>>> syms.variables[0].name
'x'
>>> syms.classes[0].name
'Foo'
- ast_harvest(source: str) → ASTSymbols
Parse source and return structural metadata without executing any code.
This is the main entry point for the AST pre-pass stage. A fresh ASTHarvester is created for each call, making this function fully re-entrant.
- Parameters:
source (str) – Raw Python source text.
- Returns:
ASTSymbols – Populated container of all harvested metadata. On a SyntaxError the result will be empty but valid; no exception is raised.
Examples
>>> syms = ast_harvest("")
>>> syms.classes
[]
>>> syms = ast_harvest("class Foo(Bar): pass")
>>> syms.classes[0].name, syms.classes[0].bases
('Foo', ['Bar'])
>>> syms = ast_harvest("async def fetch(url: str) -> None: ...")
>>> fn = syms.functions[0]
>>> fn.is_async, fn.name
(True, 'fetch')
>>> syms = ast_harvest("X = TypeVar('X')")
>>> syms.typevar_decls[0].kind
'TypeVar'
- class ASTSymbols(classes: list[ClassInfo] = <factory>, functions: list[FunctionInfo] = <factory>, variables: list[VariableInfo] = <factory>, typevar_decls: list[TypeVarInfo] = <factory>, all_exports: list[str] | None = None, skip_file: bool = False)
Bases: object
Container for all metadata harvested from a single source file’s AST.
Created by ast_harvest() and consumed by build_symbol_table().
- functions
All top-level function definitions, in source order.
- Type:
list of FunctionInfo
- variables
All top-level annotated (and plain) variable assignments.
- Type:
list of VariableInfo
- typevar_decls
TypeVar / ParamSpec / TypeVarTuple / TypeAlias / NewType declarations.
- Type:
list of TypeVarInfo
- all_exports
Contents of __all__, or None when the module has no __all__ declaration.
- functions: list[FunctionInfo]
- variables: list[VariableInfo]
- typevar_decls: list[TypeVarInfo]
- __init__(classes: list[ClassInfo] = <factory>, functions: list[FunctionInfo] = <factory>, variables: list[VariableInfo] = <factory>, typevar_decls: list[TypeVarInfo] = <factory>, all_exports: list[str] | None = None, skip_file: bool = False) → None
- class ASTHarvester(source: str)
Bases: NodeVisitor
Walk the top-level AST of a Python source file and collect structural metadata without executing any code.
Only top-level definitions are collected (class, function, and variable statements that are direct children of the module body). Statements nested inside if, with, or try blocks at the module level are visited transitively so that patterns like if TYPE_CHECKING: ... are still partially harvested.
- Parameters:
source (str) – Raw Python source text.
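The transitive visiting of module-level if, with, and try blocks can be pictured with a small generator. This is a sketch under assumptions: iter_module_level is a hypothetical helper, and which branches the real harvester descends into (else, except, finally) is not stated in this page, so the sketch simply descends into all of them.

```python
import ast


def iter_module_level(body):
    """Yield statements from a module body, descending into if/with/try blocks
    but never into function or class bodies."""
    for node in body:
        if isinstance(node, ast.If):
            yield from iter_module_level(node.body)
            yield from iter_module_level(node.orelse)
        elif isinstance(node, (ast.With, ast.AsyncWith)):
            yield from iter_module_level(node.body)
        elif isinstance(node, ast.Try):
            yield from iter_module_level(node.body)
            for handler in node.handlers:
                yield from iter_module_level(handler.body)
            yield from iter_module_level(node.orelse)
            yield from iter_module_level(node.finalbody)
        else:
            yield node


SOURCE = '''
from typing import TYPE_CHECKING
if TYPE_CHECKING:
    class Hidden: ...
try:
    def fast(): ...
except ImportError:
    def slow(): ...
def outer():
    def inner(): ...
'''

defs = [
    node.name
    for node in iter_module_level(ast.parse(SOURCE).body)
    if isinstance(node, (ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef))
]
print(defs)
```

The nested function inner is not reported because the generator yields outer as a whole and does not look inside it.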
Examples
>>> h = ASTHarvester("async def foo(): pass")
>>> syms = h.harvest()
>>> syms.functions[0].is_async
True
- harvest() → ASTSymbols
Parse the source and return the populated ASTSymbols.
Returns an empty (but valid) ASTSymbols on SyntaxError without raising.
If the source begins (before any code) with a # stubpy: ignore comment, skip_file is set to True and the returned ASTSymbols is otherwise empty.
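The "empty but valid on SyntaxError" contract can be sketched as follows. Symbols and safe_harvest are hypothetical stand-ins for ASTSymbols and harvest(), reduced to two fields; the default_factory fields mirror the <factory> defaults shown in the signatures on this page.

```python
import ast
from dataclasses import dataclass, field


@dataclass
class Symbols:  # hypothetical, simplified stand-in for ASTSymbols
    classes: list[str] = field(default_factory=list)
    functions: list[str] = field(default_factory=list)


def safe_harvest(source: str) -> Symbols:
    """Collect top-level names; return an empty container on a syntax error."""
    syms = Symbols()
    try:
        tree = ast.parse(source)
    except SyntaxError:
        return syms  # empty but valid: the caller never sees an exception
    for node in tree.body:
        if isinstance(node, ast.ClassDef):
            syms.classes.append(node.name)
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            syms.functions.append(node.name)
    return syms


print(safe_harvest("def broken(:"))
print(safe_harvest("class A: pass\ndef f(): pass\n"))
```

Returning a fresh container from each call also keeps the function re-entrant, since no state is shared between calls.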
- visit_ClassDef(node: ClassDef) → None
Harvest a class definition and its directly-defined methods.
- visit_FunctionDef(node: FunctionDef) → None
Harvest a top-level synchronous function.
- visit_AsyncFunctionDef(node: AsyncFunctionDef) → None
Harvest a top-level asynchronous function.
- visit_Assign(node: Assign) → None
Handle:
__all__ = [...]: populates all_exports.
X = TypeVar(...) / X = NewType(...): explicit TypeVar declarations.
X = int | float / X = Union[int, str]: implicit TypeAlias (bare union or subscripted generic RHS without an annotation).
Plain name = value assignments: recorded as VariableInfo.
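As an illustration of the first case, __all__ can be read statically when its value is a literal. The sketch below is an assumption about how this might be done, not the harvester's actual logic; extract_all is a hypothetical name, and non-literal values are treated as "no __all__".

```python
import ast


def extract_all(source: str):
    """Return the literal contents of a top-level __all__, or None."""
    for node in ast.parse(source).body:
        if not isinstance(node, ast.Assign):
            continue
        for target in node.targets:
            if isinstance(target, ast.Name) and target.id == "__all__":
                try:
                    value = ast.literal_eval(node.value)  # literals only, no execution
                except ValueError:
                    return None  # e.g. __all__ = build_names()
                if isinstance(value, (list, tuple)):
                    return [str(item) for item in value]
                return None
    return None


print(extract_all('__all__ = ["a", "b"]\n'))
print(extract_all("__all__ = build_names()\n"))
```

Using ast.literal_eval keeps the pass read-only: a computed __all__ is simply reported as unknown instead of being evaluated.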
- visit_TypeAlias(node: AST) → None
Handle the Python 3.12+ type Name = ... soft-keyword statement (PEP 695).
The AST node is ast.TypeAlias (available from Python 3.12). Fields are accessed by attribute so that the code compiles on Python 3.10/3.11, where the class does not exist and the method is never called.
Examples
The following source:
type Vector = list[float]
produces a TypeVarInfo with kind="TypeAlias" and source_str="list[float]".
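A version-tolerant way to handle this node outside a NodeVisitor is sketched below. It is an illustration only: pep695_aliases is a hypothetical helper, and it looks the class up with getattr so the module still imports on interpreters older than 3.12, where it returns an empty list.

```python
import ast

# ast.TypeAlias exists only on Python 3.12+; None on older interpreters.
_TYPE_ALIAS = getattr(ast, "TypeAlias", None)


def pep695_aliases(source: str) -> list[tuple[str, str]]:
    """Return (name, right-hand side) pairs for top-level `type X = ...` statements."""
    if _TYPE_ALIAS is None:
        return []
    try:
        tree = ast.parse(source)
    except SyntaxError:
        return []
    found = []
    for node in tree.body:
        if isinstance(node, _TYPE_ALIAS):
            # node.name is an ast.Name; node.value is the aliased expression.
            found.append((node.name.id, ast.unparse(node.value)))
    return found


print(pep695_aliases("type Vector = list[float]\n"))
```

The output depends on the interpreter: an empty list before Python 3.12, and one pair for Vector from 3.12 onward.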
- visit_AnnAssign(node: AnnAssign) → None
Handle annotated assignments:
name: TypeAlias = int | str → TypeVarInfo
name: Type = value → VariableInfo
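The two-way split can be sketched by inspecting the annotation text. classify_ann_assign and the set of accepted spellings are assumptions made for illustration; the real method stores TypeVarInfo and VariableInfo objects, which the sketch replaces with plain tuples.

```python
import ast

# Assumed spellings of the TypeAlias marker; the real set may differ.
_ALIAS_MARKERS = {"TypeAlias", "typing.TypeAlias", "typing_extensions.TypeAlias"}


def classify_ann_assign(node: ast.AnnAssign):
    """Return ("alias", name) or ("variable", name) for a simple annotated target."""
    if not isinstance(node.target, ast.Name):
        return None  # e.g. self.x: int = 0, not a plain module-level name
    kind = "alias" if ast.unparse(node.annotation) in _ALIAS_MARKERS else "variable"
    return kind, node.target.id


SOURCE = "Num: TypeAlias = int | str\ncount: int = 0\n"
results = [
    classify_ann_assign(node)
    for node in ast.parse(SOURCE).body
    if isinstance(node, ast.AnnAssign)
]
print(results)
```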
Data containers
- class FunctionInfo(name: str, lineno: int, is_async: bool = False, decorators: list[str] = <factory>, is_overload: bool = False, raw_arg_annotations: dict[str, str] = <factory>, raw_return_annotation: str | None = None, kwargs_forwarded_to: list[str] = <factory>, args_forwarded_to: list[str] = <factory>)
Bases: object
Metadata for a single function or method definition from the AST.
- Parameters:
name (str)
lineno (int)
is_async (bool) – True for async def definitions.
decorators (list of str) – Plain names of all decorators (e.g. ["classmethod"]).
is_overload (bool) – True when overload appears in decorators.
raw_arg_annotations (dict) – Maps parameter name → unparsed annotation string for every annotated parameter. Variadic names are prefixed: "*args", "**kwargs".
raw_return_annotation (str or None) – Unparsed return-annotation string, or None when absent.
kwargs_forwarded_to (list of str) – Names of callables to which **kwargs is forwarded in the body. Populated by the body scanner in ASTHarvester._harvest_function(). Used by resolve_function_params() to expand variadic parameters into their concrete counterparts.
args_forwarded_to (list of str) – Names of callables to which *args is forwarded in the body. Same purpose as kwargs_forwarded_to for positional variadics.
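The shape of raw_arg_annotations and raw_return_annotation can be reproduced with ast.unparse. The helper below is a hypothetical sketch (raw_annotations is not part of stubpy) that follows the prefix convention described above for variadic parameters.

```python
import ast


def raw_annotations(func):
    """Return (parameter -> annotation text, return annotation text or None)."""
    args = func.args
    out = {}
    for arg in [*args.posonlyargs, *args.args, *args.kwonlyargs]:
        if arg.annotation is not None:
            out[arg.arg] = ast.unparse(arg.annotation)
    # Variadic parameters are stored under their prefixed names.
    if args.vararg is not None and args.vararg.annotation is not None:
        out["*" + args.vararg.arg] = ast.unparse(args.vararg.annotation)
    if args.kwarg is not None and args.kwarg.annotation is not None:
        out["**" + args.kwarg.arg] = ast.unparse(args.kwarg.annotation)
    returns = ast.unparse(func.returns) if func.returns is not None else None
    return out, returns


func = ast.parse(
    "def f(x: int, y, *args: str, flag: bool = False, **kwargs: object) -> list[int]: ..."
).body[0]
annotations, return_annotation = raw_annotations(func)
print(annotations, return_annotation)
```

The unannotated parameter y is absent from the mapping, matching the "every annotated parameter" wording.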
Examples
>>> info = FunctionInfo(name="greet", lineno=5, is_async=False)
>>> info.is_overload
False
>>> info.kwargs_forwarded_to
[]
- class ClassInfo(name: str, lineno: int, bases: list[str] = <factory>, decorators: list[str] = <factory>, methods: list[FunctionInfo] = <factory>)
Bases: object
Metadata for a single class definition from the AST.
- Parameters:
name (str)
lineno (int)
bases (list of str)
decorators (list of str)
methods (list of FunctionInfo)
Examples
>>> info = ClassInfo(name="Widget", lineno=10, bases=["Element"])
>>> info.decorators
[]
- methods: list[FunctionInfo]
- class VariableInfo(name: str, lineno: int, annotation_str: str | None = None, value_repr: str | None = None)
Bases: object
Metadata for a module-level variable assignment.
Covers both annotated assignments (name: Type = value) and plain assignments without annotations (name = value).
- Parameters:
name (str)
lineno (int)
annotation_str (str or None)
value_repr (str or None)
- class TypeVarInfo(name: str, lineno: int, kind: str, source_str: str)
Bases: object
Metadata for a TypeVar, ParamSpec, TypeVarTuple, TypeAlias, or NewType declaration.
- Parameters:
name (str)
lineno (int)
kind (str)
source_str (str)
Variadic forwarding detection
_harvest_function() walks every function body and records
call sites where the function’s own **kwargs or *args is spread into
another callable. Results are stored in two fields on FunctionInfo:
kwargs_forwarded_to: callable names that receive **kwargs.
args_forwarded_to: callable names that receive *args.
These lists are consumed at emission time by
resolve_function_params(). The scan runs for both
top-level functions and class methods (including @classmethod bodies where
the cls(...) forwarding pattern is detected as the special "cls" target).
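A body scan of this kind can be sketched with ast.walk. forwarded_targets is a hypothetical helper, not the private _harvest_function(); as a simplification it also walks into nested functions and treats any call target, including cls, by its unparsed text.

```python
import ast


def forwarded_targets(func):
    """Return (args_forwarded_to, kwargs_forwarded_to) for one function node."""
    vararg = func.args.vararg.arg if func.args.vararg else None
    kwarg = func.args.kwarg.arg if func.args.kwarg else None
    args_to, kwargs_to = [], []
    for node in ast.walk(func):
        if not isinstance(node, ast.Call):
            continue
        target = ast.unparse(node.func)
        for arg in node.args:
            # f(*args): a Starred positional argument naming our own *args
            if (isinstance(arg, ast.Starred) and isinstance(arg.value, ast.Name)
                    and arg.value.id == vararg and target not in args_to):
                args_to.append(target)
        for kw in node.keywords:
            # f(**kwargs): a keyword with arg=None naming our own **kwargs
            if (kw.arg is None and isinstance(kw.value, ast.Name)
                    and kw.value.id == kwarg and target not in kwargs_to):
                kwargs_to.append(target)
    return args_to, kwargs_to


SOURCE = '''
def wrapper(x, *args, **kwargs):
    log(x)
    return make_widget(*args, **kwargs)
'''
wrapper = ast.parse(SOURCE).body[0]
print(forwarded_targets(wrapper))
```

For a classmethod body such as return cls(**kw), the same scan reports "cls" as the target, which corresponds to the special case mentioned above.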
Type alias detection
The harvester recognises type alias declarations in five forms:
Explicit annotation: Name: TypeAlias = <rhs>
PEP 604 bare union: Name = int | float
Subscripted generic: Name = Union[str, int], Name = list[int]
Known type name: Name = int, Name = str, Name = Any
Python 3.12+ PEP 695: type Name = <rhs>, type Stack[T] = list[T]
All five forms are stored as TypeVarInfo with kind="TypeAlias"
and emitted via generate_alias_stub().
Assignments where the RHS is an arbitrary user-defined name
(MyAlias = SomeClass) are not promoted — the harvester cannot
determine at parse time whether SomeClass is a type or a runtime value.
Use MyAlias: TypeAlias = SomeClass for an unambiguous declaration.
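The promotion rule for unannotated assignments can be approximated by looking only at the shape of the right-hand side. The sketch below is an assumption about the heuristic, not the harvester's code: looks_like_alias, alias_names, and the KNOWN_TYPE_NAMES subset are hypothetical, and a purely syntactic check like this would also match runtime expressions such as flags | other or data[0].

```python
import ast

# Assumed subset of "known" type names; the real list is not given on this page.
KNOWN_TYPE_NAMES = {"int", "str", "float", "bool", "bytes", "Any"}


def looks_like_alias(rhs: ast.expr) -> bool:
    """Decide from syntax alone whether an assignment RHS looks like a type."""
    if isinstance(rhs, ast.BinOp) and isinstance(rhs.op, ast.BitOr):
        return True  # Name = int | float
    if isinstance(rhs, ast.Subscript):
        return True  # Name = Union[str, int], Name = list[int]
    if isinstance(rhs, ast.Name):
        return rhs.id in KNOWN_TYPE_NAMES  # Name = int, but not Name = SomeClass
    return False


def alias_names(source: str) -> list[str]:
    """Return names of top-level single-target assignments promoted to aliases."""
    names = []
    for node in ast.parse(source).body:
        if (isinstance(node, ast.Assign) and len(node.targets) == 1
                and isinstance(node.targets[0], ast.Name)
                and looks_like_alias(node.value)):
            names.append(node.targets[0].id)
    return names


SOURCE = '''
Number = int | float
Pair = Union[str, int]
Count = int
MyAlias = SomeClass
limit = 10
'''
print(alias_names(SOURCE))
```

MyAlias is left out because SomeClass is an arbitrary name, which is the ambiguity the explicit TypeAlias annotation resolves.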
The # stubpy: ignore directive
A source file that begins with # stubpy: ignore (case-insensitive,
before any code) will have skip_file set to True.
The generator detects this and skips emission, writing only a minimal
stub. Other comments and blank lines before the first code statement
are also accepted, so the directive does not have to be the first line:
# Auto-generated file — do not stub.
# stubpy: ignore
...