<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Architecture on Modelplane Docs</title><link>https://docs.modelplane.ai/architecture/</link><description>Recent content in Architecture on Modelplane Docs</description><generator>Hugo -- gohugo.io</generator><language>en-us</language><lastBuildDate>Mon, 01 Jan 0001 00:00:00 +0000</lastBuildDate><atom:link href="https://docs.modelplane.ai/architecture/index.xml" rel="self" type="application/rss+xml"/><item><title>How it schedules</title><link>https://docs.modelplane.ai/architecture/scheduling/</link><pubDate/><guid>https://docs.modelplane.ai/architecture/scheduling/</guid><description>&lt;p&gt;&lt;strong&gt;API:&lt;/strong&gt; &lt;a href="https://docs.modelplane.ai/reference/modeldeployments/"&gt;&lt;code&gt;modelplane.ai/v1alpha1&lt;/code&gt; · ModelDeployment&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;When an ML team creates a &lt;a href="https://docs.modelplane.ai/models/model-deployment/"&gt;ModelDeployment&lt;/a&gt;,
the fleet scheduler decides which cluster each replica runs on and which node
pool each engine uses. Platform teams don&amp;rsquo;t drive the scheduler directly, but what they
publish (the clusters, their labels, and each pool&amp;rsquo;s
&lt;a href="https://docs.modelplane.ai/platform/inference-class/"&gt;InferenceClass&lt;/a&gt;) is exactly what the
scheduler matches against. This page explains how the scheduler places work and where it
deliberately stops short, so you can reason about why a deployment landed where it
did.&lt;/p&gt;</description></item></channel></rss>