Support quantize_ for convolution on XNNPACK #17432

@GregoryComer

Description

XNNPACK's partitioner and lowering logic don't recognize the quantize_affine/dequantize_affine representation that torchao's quantize_ produces in the IR for convolution nodes. We need to wire this up.

Example quantization code:

import torch
from torchao.quantization import quantize_, IntxWeightOnlyConfig
from torchao.quantization.granularity import PerAxis

conv_config = IntxWeightOnlyConfig(
    weight_dtype=torch.int8,
    granularity=PerAxis(0),
)
quantize_(
    model,
    conv_config,
    lambda m, fqn: isinstance(m, torch.nn.Conv2d),
)
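For reference, the numeric transform that this config applies to each Conv2d weight is per-axis (per output channel) symmetric int8 quantization. The sketch below is not ExecuTorch or torchao code; it is a minimal NumPy illustration of that math, with the helper names (quantize_per_axis_int8, dequantize) invented here for clarity:

```python
# Hedged sketch: per-axis symmetric int8 weight quantization, the
# transform IntxWeightOnlyConfig(weight_dtype=torch.int8,
# granularity=PerAxis(0)) performs conceptually. NumPy-only illustration;
# helper names are hypothetical, not torchao APIs.
import numpy as np

def quantize_per_axis_int8(weight, axis=0):
    """One scale per slice along `axis` (axis 0 = output channel, OIHW)."""
    reduce_axes = tuple(i for i in range(weight.ndim) if i != axis)
    amax = np.abs(weight).max(axis=reduce_axes, keepdims=True)
    scale = amax / 127.0  # symmetric int8 range [-127, 127]
    q = np.clip(np.round(weight / scale), -128, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    # Affine dequantization with zero_point = 0 (symmetric case).
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal((8, 3, 3, 3)).astype(np.float32)  # OIHW conv weight
q, scale = quantize_per_axis_int8(w)
err = np.abs(dequantize(q, scale) - w).max()
```

In the exported graph, this shows up as dequantize_affine nodes feeding the convolution, which is the pattern the partitioner would need to match.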

cc @digantdesai @cbilgin

Metadata

    Labels

module: xnnpack (issues related to xnnpack delegation and the code under backends/xnnpack/)
