🧪 Tutorial: Urban Mapper + AutoDDG + Jupyter Pipeline
This tutorial shows how to stack three MCPs:
- Urban Mapper (urban computing analysis utilising the Urban Mapper official library)
- AutoDDG (dataset description & context generation via Large Language Models)
- Jupyter (reproducible notebook analysis)
About Urban Mapper:
- Urban Mapper official repository: https://github.com/VIDA-NYU/UrbanMapper
- Urban Mapper documentation: https://urbanmapper.readthedocs.io/en/latest/
- Urban Mapper MCP: https://github.com/MCP-Pipeline/mcpstack-urbanmapper
About AutoDDG:
- AutoDDG official repository: https://github.com/VIDA-NYU/AutoDDG
- AutoDDG MCP: https://github.com/MCP-Pipeline/mcpstack-autoddg
You'll learn how to:
- Build a pipeline with the three tools
- Ask in natural language for a reproducible urban analysis workflow built with Urban Mapper, and let the LLM explore the dataset with AutoDDG to gather context that better informs the analysis
- Export code and results into a Jupyter Notebook for reproducible Python analysis
🎥 Video Walkthrough
Prerequisites
# If you prefer another Python package manager, adapt the commands accordingly (e.g. `pip install X`).
uv init --python 3.10
uv add mcpstack
uv add mcpstack-jupyter
uv add mcpstack-urbanmapper
uv add mcpstack-autoddg
# Check that all the tools are connected
uv run mcpstack list-tools
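As a complement to `list-tools`, the following minimal Python sketch checks that the four distributions added above are installed in the current environment. It relies only on the standard library; the file name `check_install.py` used in the usage note is hypothetical.

```python
from importlib import metadata

# Distribution names taken from the `uv add` lines above.
PACKAGES = [
    "mcpstack",
    "mcpstack-jupyter",
    "mcpstack-urbanmapper",
    "mcpstack-autoddg",
]


def installed_versions(names):
    """Return {name: version or None} for each distribution name."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The distribution is not installed in this environment.
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(PACKAGES).items():
        print(f"{name}: {version or 'NOT INSTALLED'}")
```

Saved as `check_install.py`, it can be run inside the project with `uv run python check_install.py`. It only confirms that the packages are installed, not that each tool is registered with MCPStack, which is what `list-tools` reports.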
🔧 Step 1 – Build the Pipeline with the Urban Mapper Default
By default, Urban Mapper loads the datasets for urban pipeline analysis from Hugging Face datasets. You can configure it otherwise, but it is best to start with the default.
🔧 Step 2 – Create a Jupyter ToolConfig
The Jupyter MCP connects the LLM to a running Jupyter instance through a URL and a token, which is why it needs a ToolConfig.
uv run mcpstack tools jupyter configure \
--token YOUR_JUPYTER_TOKEN
# This creates a `jupyter_config.json` file
# Example token: 1117bf468693444a5608e882ab3b55d511f354a175a0df02
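To make the URL-and-token connection concrete, here is a minimal sketch, not the Jupyter MCP's own code. It generates a token with the same shape as the example (48 hexadecimal characters) and builds an authenticated request to the Jupyter Server REST endpoint `/api/status`, using the `Authorization: token <token>` header that Jupyter Server accepts. The base URL `http://localhost:8888` in the usage note is an assumption (Jupyter's usual default), and the helper names are hypothetical.

```python
import secrets
import urllib.request


def new_token() -> str:
    """Return a random 48-character hex token, the same shape as the example above."""
    return secrets.token_hex(24)


def status_request(base_url: str, token: str) -> urllib.request.Request:
    """Build (but do not send) an authenticated GET for the server's /api/status."""
    url = base_url.rstrip("/") + "/api/status"
    return urllib.request.Request(url, headers={"Authorization": f"token {token}"})


def server_is_reachable(base_url: str, token: str, timeout: float = 5.0) -> bool:
    """Return True if the Jupyter server answers /api/status with HTTP 200."""
    try:
        request = status_request(base_url, token)
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # URLError and HTTPError are OSError subclasses: the server is
        # unreachable or it rejected the token.
        return False
```

Calling `server_is_reachable("http://localhost:8888", "YOUR_JUPYTER_TOKEN")` before configuring the tool is a quick way to confirm that the token passed to `--token` is the one the running server expects.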
🔧 Step 3 – Add the Tool to the Pipeline
🔧 Step 4 – Add AutoDDG to the Pipeline
This opens an interactive prompt to set at least one mandatory environment variable: AUTO_DDG_OPENAI_API_KEY.
It is required for AutoDDG to use OpenAI large language models. To generate a key,
please refer to: https://platform.openai.com/account/api-keys.
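A small sketch for checking whether the variable is visible to the current process, without printing the secret. It only inspects the process environment; whether the interactive setup stores the key there or in a configuration file is not stated here, so treat a "not set" result as a hint rather than proof. The helper name is hypothetical.

```python
import os

ENV_VAR = "AUTO_DDG_OPENAI_API_KEY"  # name taken from the text above


def describe_key(environ=os.environ) -> str:
    """Report whether the key is set, without revealing the secret itself."""
    value = environ.get(ENV_VAR, "").strip()
    if not value:
        return f"{ENV_VAR} is not set"
    # Show only the length and the last four characters.
    return f"{ENV_VAR} is set ({len(value)} characters, ending in ...{value[-4:]})"


if __name__ == "__main__":
    print(describe_key())
```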
🔧 Step 5 – Add the Tool to the Pipeline
🔧 Step 6 – Compose & Run the Pipeline on Claude Desktop
Now you can ask the LLM to explore a dataset, run an Urban Mapper pipeline analysis, and export the results to Jupyter.
Prompt Used During The Demo Video
Initial Prompt
Hi there! I am interested in a *per-street* `Urban Mapper (UM)` *pipeline* *analysis* in **Downtown Brooklyn, NYC, USA** using the `oscur/nyc_311` dataset, which broadly speaking shows complaints in the streets of New York City. This dataset can be obtained via `UM` actions.
What I am genuinely interested in first is to explore the dataset's context & description, please, using `AutoDDG` actions. Do not add any manual analysis of any sort.
**Shall we? Additionally, do download the full `oscur/nyc_311` data, but stream/sample the dataset up to 30k rows maximum when using `AutoDDG`.**
Note, if you have to download any packages, do so via `!uv add <package_name>`.
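For readers who want to see what "stream up to 30k rows" could look like in code, here is a minimal sketch. It is not what Urban Mapper or AutoDDG do internally. `take_rows` works on any iterable; `sample_from_hub` assumes the Hugging Face `datasets` package is installed and that the dataset has a split named `train`, neither of which is confirmed by this tutorial. Both function names are hypothetical.

```python
from itertools import islice

MAX_ROWS = 30_000  # cap mentioned in the prompt above


def take_rows(rows, limit=MAX_ROWS):
    """Materialise at most `limit` rows from any iterable, without reading the rest."""
    return list(islice(rows, limit))


def sample_from_hub(name="oscur/nyc_311", split="train", limit=MAX_ROWS):
    """Stream a Hugging Face dataset and keep only the first `limit` rows.

    Assumptions: the `datasets` package is installed and the dataset has a
    split called "train".
    """
    # Imported lazily so that take_rows stays usable without `datasets`.
    from datasets import load_dataset

    stream = load_dataset(name, split=split, streaming=True)
    return take_rows(stream, limit)
```

Note that this keeps the first rows of the stream rather than a random sample, so it is only representative if the dataset is not ordered in a way that matters for the description.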
Follow-up Prompt
So, prior to even building a UM pipeline analysis with nyc_311, can you explore what `UM` is, along with examples of it, and then, based on what you've seen from `UM` and what you have gathered from AutoDDG on the dataset, list out here, not in the notebook, the various `enrichers` you believe would be interesting to extract from the dataset and apply to the streets of Downtown Brooklyn, please? Be straightforward-thinking.
Attaching a bit more documentation about enrichers.
Follow-up Prompt – Optional
`nunique` is not available, and `mean` on a categorical feature will most probably crash.
You have to pass a custom lambda function to deal with all that; very simple, straightforward custom lambda functions will do.
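As an illustration of the kind of custom function this prompt asks for, the sketch below defines two aggregations that are safe on categorical values: a distinct-value count (standing in for `nunique`) and a most-frequent-value pick (standing in for `mean`). It uses only the standard library and works on any iterable of values. The function names are hypothetical, and how the callable is handed to an Urban Mapper enricher is not shown in this tutorial.

```python
from collections import Counter


def _present(values):
    """Drop missing entries: None and float NaN (NaN is never equal to itself)."""
    return [v for v in values if v is not None and v == v]


def count_unique(values):
    """Number of distinct non-missing values in the group."""
    return len(set(_present(values)))


def most_common(values):
    """Most frequent non-missing value in the group, or None if the group is empty."""
    kept = _present(values)
    return Counter(kept).most_common(1)[0][0] if kept else None


# The same logic written as lambdas, the form the prompt asks for.
unique_count = lambda values: count_unique(values)
top_category = lambda values: most_common(values)
```

For a group of complaint types such as `["Noise", "Noise", "Heating", None]`, `unique_count` gives 2 and `top_category` gives `"Noise"`.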
Follow-up Prompt
Let's build the pipeline with the FULL data and all those very interesting stacked enrichers
(
no need to create the pipeline multiple times, stack the enrichers,
and then straight after building, .preview(.), .composed_transform(.)
(by the way, no need to grab the output of this function for the time being),
and most importantly, follow with nothing other than: .visualise([<with the names of all the output columns created>])
).
Shall we :) ? Output everything in the Jupyter notebook.
Recall for any packages needed `!uv add <package_name>`.
Tip
Try chaining additional tools to build research-ready urban analysis (e.g. ML) workflows.