Quick Start
Development Approach
MaaFramework provides three integration solutions to meet different development scenarios:
Approach 1: Pure JSON Low-Code Programming (General UI)
Applicable Scenarios: Quick start, simple logic implementation
Features:
- No coding experience required
- Automated processes configured through JSON
- Comes with a 🎞️ video tutorial and a ⭐ project template
{
    "Click Start Button": {
        "recognition": "OCR",           // Text recognition engine
        "expected": "Start",            // Target text
        "action": "Click",              // Execute click action
        "next": ["Click Confirm Icon"]  // Subsequent task chain
    },
    "Click Confirm Icon": {
        "recognition": "TemplateMatch", // Image template matching
        "template": "confirm.png",      // Matching asset path
        "action": "Click"
    }
}
Approach 2: JSON + Custom Logic Extension (Recommended)
💡 Core feature of release v4.x
Features:
- Retains the low-code advantage of JSON
- Registers custom task modules through AgentServer
- Seamlessly integrates with the ⭐ boilerplate.
{
    "Click Confirm Icon": {
        "next": ["Custom Processing Module"]
    },
    "Custom Processing Module": {
        "recognition": "Custom",
        "custom_recognition": "MyReco", // Custom recognizer ID
        "action": "Custom",
        "custom_action": "MyAct"        // Custom action ID
    }
}
💡 The General UI will automatically connect to your AgentServer child process and call the corresponding recognition/action when executing MyReco/MyAct.
# Python pseudo-code example
from maa.agent.agent_server import AgentServer

# Register a custom recognizer
@AgentServer.custom_recognition("MyReco")
class CustomReco:
    def analyze(self, ctx):
        return (10, 10, 100, 100)  # Return your own processed recognition result

# Register a custom action
@AgentServer.custom_action("MyAct")
class CustomAction:
    def run(self, ctx):
        ctx.controller.post_click(100, 10).wait()  # Execute the click
        ctx.override_next(["TaskA", "TaskB"])      # Dynamically adjust the task flow

# Start the Agent service
AgentServer.start_up(sock_id)
For a complete example, refer to the template commit.
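In practice, the agent script is launched by the General UI as a child process and only needs to register its modules and start the server. The sketch below is illustrative rather than the exact template code; it assumes the socket id is passed as the last command-line argument and that the Python binding exposes AgentServer.join()/shut_down(), so verify the names against the template.

# Illustrative agent entry point (verify names against the project template)
import sys

from maa.agent.agent_server import AgentServer
from maa.toolkit import Toolkit

def main():
    Toolkit.init_option("./")      # Optional: enable logging / debug options
    sock_id = sys.argv[-1]         # Assumption: the UI passes the socket id as the last argument
    AgentServer.start_up(sock_id)  # Serve the registered custom recognitions/actions
    AgentServer.join()             # Block until the UI side disconnects
    AgentServer.shut_down()

if __name__ == "__main__":
    main()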
Approach 3: Full-Code Development
Applicable Scenarios:
- Deep customization requirements
- Implementation of complex business logic
- Need for flexible control over the execution flow
# Python pseudo-code example
def main():
    # Execute the predefined JSON task
    result = tasker.post_task("Click Start Button").wait().get()
    if result.completed:
        # Execute code-level operations
        tasker.controller.post_click(100, 100)
    else:
        # Get the current screenshot
        image = tasker.controller.cached_image

    # Register a custom action
    tasker.resource.register_custom_action("MyAction", MyAction())

    # Execute a mixed task chain
    tasker.post_task("Click Confirm Icon").wait()
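The pseudo-code above assumes a tasker that is already bound to a resource and a controller. Below is a rough setup sketch using the Python binding; it assumes an ADB-controlled Android device and method names such as post_bundle, find_adb_devices, and bind, so double-check them against the binding version you install.

# Rough setup sketch for the Python binding (verify the exact API against the integration docs)
from maa.controller import AdbController
from maa.resource import Resource
from maa.tasker import Tasker
from maa.toolkit import Toolkit

Toolkit.init_option("./")

# Load the resource bundle described in "Resource Preparation" below
resource = Resource()
resource.post_bundle("./my_resource").wait()

# Connect to the first ADB device found (assumption: an Android target)
device = Toolkit.find_adb_devices()[0]
controller = AdbController(adb_path=device.adb_path, address=device.address)
controller.post_connection().wait()

# Bind everything to a tasker, then run tasks as in the pseudo-code above
tasker = Tasker()
tasker.bind(resource, controller)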
Resource Preparation
File Structure Specification
⭐ If you use the boilerplate, just modify it directly in the folder.
my_resource/
├── image/                # Image asset library
│   ├── my_button_ok.png
│   └── my_icon_close.png
├── model/
│   └── ocr/              # Text recognition models
│       ├── det.onnx
│       ├── keys.txt
│       └── rec.onnx
└── pipeline/             # Task pipelines
    ├── my_main.json
    └── my_subflow.json
You can modify the names of files and folders starting with "my_", but the others have fixed file names and should not be changed. Here's a breakdown:
Pipeline JSON Files
The files in my_resource/pipeline contain the main script execution logic; MaaFramework recursively reads every JSON file in this directory. Refer to the Task Pipeline Protocol for how to write them, and see the simple demo for reference.
Tools:
- JSON Schema
- VSCode Extension
  - Config resources based on interface.json
  - Support going to task definitions, finding task references, renaming tasks, completing task names, and clicking to launch a task
  - Support launching as MaaPiCli
  - Support screencap and crop image after connected
Image Files
The files in my_resource/image are primarily the template matching images, feature detection images, and other images required by the pipeline. They are read according to the template and other fields specified in the pipeline.
Please note that the images used must be cropped from a lossless original screenshot and scaled to 720p. Unless you know exactly how MaaFramework processes images, do use the cropping tools below to obtain them.
Text Recognition Model Files
⭐ If you use the boilerplate, just follow its documentation and run configure.py to automatically deploy the model files.
The files in my_resource/model/ocr are ONNX models obtained from PaddleOCR after conversion.
You can use our pre-converted files: MaaCommonAssets. Choose the languages you need and store them following the directory structure shown above in File Structure Specification.
If needed, you can also fine-tune the official pre-trained models of PaddleOCR yourself (please refer to the official PaddleOCR documentation) and convert them to ONNX files for use. You can find conversion commands here.
Debug
Use Development Tool.
Some tools will generate a config/maa_option.json file in the same directory, including:
- logging: Save logs, generating debug/maa.log. Default true.
- recording: Save recordings; all screenshots and operation data during a run are saved, and you can use DbgController for reproducible debugging. Default false.
- save_draw: Save image recognition visualization results; every visualization drawing produced during the run is saved. Default false.
- show_hit_draw: Show a pop-up window on node hits; each time recognition succeeds, a pop-up displays the result. Default false.
- stdout_level: Console log level. The default is 2 (Error); set it to 0 to turn off all console logs, or 7 to enable all console logs.
If you integrate MaaFramework yourself, you can enable these debugging options through the Toolkit.init_option / MaaToolkitConfigInitOption interface. The generated JSON file is the same as above.
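For example, in a Python integration you can initialize the option directory at startup. The snippet below is a minimal sketch, assuming the maa Python binding and "./" as the user path; the exact signature may differ between releases.

# Minimal sketch: initialize the option directory so config/maa_option.json is generated
from maa.toolkit import Toolkit

# "./" is the user path; MaaFramework creates config/, debug/, etc. under it
if not Toolkit.init_option("./"):
    raise RuntimeError("Failed to initialize MaaFramework options")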
Run
You can run your project using a generic UI (MaaPiCli, MFA, MFW, etc.) or by writing integration code yourself.
Using MaaPiCli
⭐ If you use the boilerplate, follow its documentation directly and run install.py to automatically package the relevant files.
Use MaaPiCli in the bin folder of the Release package, write an interface.json, and place it in the same directory to use it.
MaaPiCli covers the basic functionality, and more features are continuously being added. Detailed documentation is still being improved; for now, you can refer to the Sample when writing interface.json.
Writing Integration Code Yourself
Please refer to the Integration Documentation.