At Cameyo, our load balancing system is designed to deliver the best possible user experience through stable, performant, and reliable application sessions. This article explains how Cameyo intelligently manages server resources in cloud environments and addresses why you might observe server provisioning even when current CPU/RAM utilization appears low.

Prioritizing Stability and Performance Over Strict Vertical Scaling

Unlike traditional on-premises environments, cloud environments are elastic: capacity can scale dynamically to meet user demand. Cameyo uses this elasticity to keep sessions performant. You might expect new servers to appear only when existing ones reach their configured capacity. However, Cameyo's approach prioritizes:

  • Preventing Performance Degradation: Waiting until a server is fully utilized before adding another can lead to slowdowns and unresponsiveness during peak usage or sudden resource spikes (for example, application launches or heavy tasks).

  • Maintaining Responsiveness: By keeping a reserve of resources available, Cameyo ensures that new sessions can start quickly and existing sessions remain smooth, even during fluctuating demand.

  • Avoiding Service Disruptions: Running servers at 100% utilization increases the risk of application crashes and session disconnects, which lead to a negative user experience and potential data loss.

The Role of Proactive Scaling

Cameyo's load balancing algorithm considers several factors beyond just the immediate CPU/RAM usage, including:

  • Number of Active Sessions: A higher number of users on a server increases the likelihood of resource contention.

  • Anticipated Load: Cameyo predicts potential future load based on historical patterns and the number of pending session requests.

  • Ensuring Capacity for New Sessions: When new users attempt to connect, Cameyo needs to ensure sufficient resources are available to provide them with a good initial experience.
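The factors above can be illustrated with a minimal sketch. This is not Cameyo's actual algorithm; the names (ServerSnapshot, needs_another_server), the per-server session cap, and the headroom and reserve values are all hypothetical, chosen only to show how a pool can need another server while CPU/RAM utilization is still low.

```python
from dataclasses import dataclass


@dataclass
class ServerSnapshot:
    """Hypothetical point-in-time view of one server."""
    cpu_percent: float
    ram_percent: float
    active_sessions: int


def needs_another_server(servers, pending_requests,
                         max_sessions_per_server=10,
                         headroom_percent=30.0,
                         reserve_slots=1):
    """Return True when the pool lacks spare capacity.

    All thresholds are illustrative assumptions, not Cameyo defaults.
    """
    free_slots = 0
    for s in servers:
        busiest = max(s.cpu_percent, s.ram_percent)
        # A server that has eaten into its headroom accepts no new sessions.
        if busiest > 100.0 - headroom_percent:
            continue
        free_slots += max(0, max_sessions_per_server - s.active_sessions)
    # Keep reserve_slots free on top of the sessions already waiting.
    return free_slots < pending_requests + reserve_slots


# CPU and RAM are low, but the server is almost out of session slots.
pool = [ServerSnapshot(cpu_percent=20.0, ram_percent=30.0, active_sessions=9)]
print(needs_another_server(pool, pending_requests=2))
```

In this sketch the decision is driven by session count and pending requests as well as utilization, which is why a lightly loaded server does not by itself rule out provisioning.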

Understanding "Elasticity Burst"

To further safeguard against service disruptions, Cameyo employs an "Elasticity Burst" mechanism. If the system detects that existing servers are nearing their recommended capacity and new sessions are being requested, it may temporarily exceed your configured "maximum number of instances" to launch additional servers.

  • Purpose: This is a critical safety feature to prevent new sessions from overwhelming already stressed servers, which would negatively impact all users.

  • Trigger: The "burst" occurs when Cameyo determines that adhering strictly to the maximum instance limit would lead to a degraded or failed experience for new users.

  • Outcome: Although a burst may temporarily increase the number of active servers, it preserves the stability and functionality of your overall environment.
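As a rough illustration of the trigger described above, the sketch below encodes one possible burst rule. It is an assumption-laden example, not Cameyo's implementation: should_burst, near_capacity, and burst_allowance are hypothetical names, and the real conditions and limits are not documented in this article.

```python
def should_burst(running, configured_max, near_capacity,
                 pending_requests, burst_allowance=1):
    """Decide whether to exceed the configured maximum (hypothetical rule).

    running          -- servers currently running
    configured_max   -- the configured "maximum number of instances"
    near_capacity    -- running servers that are near recommended capacity
    pending_requests -- new sessions waiting to be placed
    burst_allowance  -- assumed cap on extra servers beyond the maximum
    """
    at_limit = running >= configured_max
    all_stressed = running > 0 and near_capacity >= running
    within_burst = running < configured_max + burst_allowance
    return at_limit and all_stressed and pending_requests > 0 and within_burst


# At the limit, every server stressed, sessions waiting: burst is allowed.
print(should_burst(running=3, configured_max=3, near_capacity=3, pending_requests=2))
# One server still has room: new sessions go there instead.
print(should_burst(running=3, configured_max=3, near_capacity=2, pending_requests=2))
```

Below the configured maximum the function returns False because ordinary scaling, not a burst, would apply in that case.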

Important Note on "Vertical/Horizontal" Load Balancing Setting (for Cloud Environments)

You might have noticed a "Vertical/Horizontal" load balancing setting in the Cameyo interface. For cloud-based deployments, this setting is less directly applicable: Cameyo's underlying load balancing algorithm is optimized for cloud elasticity and distributes load for stability and performance regardless of this setting. We recognize that the setting can be misleading in cloud environments, and we are reviewing how to present it more clearly in future updates.

Why You Might See Servers Provisioned with Minimal Load

The behavior you may be observing (servers being provisioned at seemingly low current utilization, despite your LBFACTORS PowerTag values) results from Cameyo proactively reserving enough capacity to handle potential load increases and new session requests without compromising stability. This is a key aspect of providing a reliable and performant application delivery platform.

Our Commitment to Optimal Performance and Stability

Cameyo's load balancing design is based on years of experience managing numerous cloud deployments. Our priority is to provide a consistent, high-quality user experience. While we understand the desire for cost optimization, strictly limiting server resources can negatively impact the stability and performance of your application delivery.

If you have specific concerns or observe behavior that you believe is not aligned with this explanation, please provide detailed monitoring data (CPU/RAM utilization over time, number of active sessions, timestamps of server provisioning) to our support team for further investigation.
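If it helps to organize that data before contacting support, one option is a single time-ordered CSV. The sketch below is only a suggestion: the column names, server names, and sample values are made up for illustration, and Cameyo support does not require this particular format.

```python
import csv
import io

# Hypothetical column layout covering the data points listed above.
FIELDS = ["timestamp_utc", "server", "cpu_percent",
          "ram_percent", "active_sessions", "event"]


def to_csv(rows):
    """Serialize monitoring samples (a list of dicts) to CSV text."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


# Illustrative samples only; replace with your own measurements.
samples = [
    {"timestamp_utc": "2024-01-01T09:00:00Z", "server": "server-a",
     "cpu_percent": 22, "ram_percent": 35, "active_sessions": 4, "event": ""},
    {"timestamp_utc": "2024-01-01T09:05:00Z", "server": "server-b",
     "cpu_percent": 0, "ram_percent": 0, "active_sessions": 0,
     "event": "provisioned"},
]
print(to_csv(samples))
```

Keeping utilization samples and provisioning events in one timeline makes it easier to see what the pool looked like at the moment each server was added.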