Collect all the performance hits
or: don't eat too much (syntactic) sugar!
As mentioned previously in 2025-02-11_laravels-magic-and-its-performance-costs #4, Laravel’s collections are powerful, but misusing them can lead to growing performance losses. Back then I pointed out the pitfall that is method chaining (which is awesome and I love it): under the hood, it results in many, many loops. But that's not the deepest part of the pit.
Today I found a nice and clean piece of code (which I had written myself a couple of years ago, hats off to me) that worked with SVG paths and points along those paths.
One of the responsible classes had a single property:
```php
class MyClass {
    public function __construct(
        private \Illuminate\Support\Collection $path
    ) {
    }
}
```
which basically held a lot of these:
```php
class Point {
    public function __construct(
        public float $x,
        public float $y,
    ) {
    }
}
```
and did a bunch of this:
```php
$this->path->where('x', '<', $someWhere);
$this->path->filter(...)->sum('x');
$this->path->filter(...)->max('x');
$this->path->filter(...)->min('x');
```
Looks super clean. Easy to read. Easy to understand.
BUT
But there was a reason I was looking at that class again: we were spending a lot of processing time here. I know the Collection class, I know the EnumeratesValues trait, so what's the problem? The problem is that we ran into ginormous paths, and all of a sudden thousands of iterations turned into millions and milliseconds turned into minutes. So I took a closer look at how this magic trick works.
```php
/**
 * Filter items by the given key value pair.
 *
 * @param  callable|string  $key
 * @param  mixed  $operator
 * @param  mixed  $value
 * @return static
 */
public function where($key, $operator = null, $value = null)
{
    return $this->filter($this->operatorForWhere(...func_get_args()));
}
```
Doesn't look too bad, so it has to be inside `operatorForWhere`:
```php
/**
 * Get an operator checker callback.
 *
 * @param  callable|string  $key
 * @param  string|null  $operator
 * @param  mixed  $value
 * @return \Closure
 */
protected function operatorForWhere($key, $operator = null, $value = null)
{
    if ($this->useAsCallable($key)) {
        return $key;
    }

    if (func_num_args() === 1) {
        $value = true;
        $operator = '=';
    }

    if (func_num_args() === 2) {
        $value = $operator;
        $operator = '=';
    }

    return function ($item) use ($key, $operator, $value) {
        $retrieved = data_get($item, $key);

        $strings = array_filter([$retrieved, $value], function ($value) {
            return is_string($value) || (is_object($value) && method_exists($value, '__toString'));
        });

        if (count($strings) < 2 && count(array_filter([$retrieved, $value], 'is_object')) == 1) {
            return in_array($operator, ['!=', '<>', '!==']);
        }

        switch ($operator) {
            default:
            case '=':
            case '==': return $retrieved == $value;
            case '!=':
            case '<>': return $retrieved != $value;
            case '<': return $retrieved < $value;
            case '>': return $retrieved > $value;
            case '<=': return $retrieved <= $value;
            case '>=': return $retrieved >= $value;
            case '===': return $retrieved === $value;
            case '!==': return $retrieved !== $value;
            case '<=>': return $retrieved <=> $value;
        }
    };
}
```
It can't be the lines before the returned \Closure: they run once per `where()` call, not once per entry in the Collection, so let's skip over those.
Actually no, let's not. Why is there no else if? Both `func_num_args()` checks always run, so we always get two calls and two comparisons, even though at most one of them can match. Why? Is this for so-called "readability"?
I don't like it.
Oh! And by the way, here are some benchmarks for a single path with around 3,000,000 points:
| Function | Time (ms) |
|---|---|
| `$this->path->where('x', '<', $someWhere)` | 18219 |
| `$this->path->filter(static fn(Point $p) => $p->x < $someWhere)` | 1633 |
| `$this->path->max('x')` | 15278 |
| `$this->path->reduce(static fn(float $result, Point $p) => $p->x > $result ? $p->x : $result, PHP_INT_MIN)` | 3979 |
| `$this->path->sum('x')` | 11713 |
| `$this->path->reduce(static fn(float $result, Point $p) => $result + $p->x, 0)` | 2329 |
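The table does not say how the timings were taken, so here is only a rough sketch of one way to collect comparable numbers. The `timed()` helper is hypothetical, the path is far smaller than the real one, and `array_filter` stands in for `Collection::filter` so the snippet runs without Laravel installed.

```php
<?php

declare(strict_types=1);

// Stand-in for the Point class shown earlier.
final class Point
{
    public function __construct(
        public float $x,
        public float $y,
    ) {
    }
}

/**
 * Hypothetical helper: runs $work once and returns [result, elapsed milliseconds].
 */
function timed(callable $work): array
{
    $start = hrtime(true); // monotonic clock, in nanoseconds
    $result = $work();
    $elapsedMs = (hrtime(true) - $start) / 1_000_000;

    return [$result, $elapsedMs];
}

// Small synthetic path; the table above was measured on roughly 3,000,000 points.
$points = [];
for ($i = 0; $i < 10_000; $i++) {
    $points[] = new Point((float) $i, 0.0);
}

$someWhere = 5_000.0;

[$left, $ms] = timed(
    static fn (): array => array_filter(
        $points,
        static fn (Point $p): bool => $p->x < $someWhere,
    ),
);

printf("%d points matched in %.3f ms\n", count($left), $ms);
```

Swapping the callback passed to `timed()` for the `where`, `max` or `sum` variant gives the other rows. A single run is noisy, so repeating each measurement a few times is a good idea.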
Now let's take a look at the \Closure, and look who it is: `data_get`, probably the culprit. It's always the culprit.
Let's replace my `->where(...)` calls with the anonymous Closure provided by `operatorForWhere` and remove that stinking `data_get` call: the modified \Closure "only" took 8283 ms. That's about ten seconds faster already!
Since we're working with floats, we can remove the next few lines (the string and object checks) as well, and now we're at 2171 ms, which is almost as fast as the `filter` call.
We could now also remove the switch, but that's pointless.
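The two stripping steps can be sketched roughly like this. It is an illustration, not the exact code that was benchmarked: the function names `whereXWithoutDataGet`, `whereXFloatOnly` and `compareFloats` are made up, the `<=>` case is left out so everything returns a bool, and `array_filter` stands in for `Collection::filter` so the snippet runs without Laravel.

```php
<?php

declare(strict_types=1);

// Stand-in for the Point class shown earlier.
final class Point
{
    public function __construct(
        public float $x,
        public float $y,
    ) {
    }
}

// Hypothetical helper mirroring the switch in operatorForWhere (without '<=>').
function compareFloats(float $retrieved, string $operator, float $value): bool
{
    return match ($operator) {
        '!=', '<>' => $retrieved != $value,
        '<' => $retrieved < $value,
        '>' => $retrieved > $value,
        '<=' => $retrieved <= $value,
        '>=' => $retrieved >= $value,
        '===' => $retrieved === $value,
        '!==' => $retrieved !== $value,
        default => $retrieved == $value, // '=', '==' and anything unknown
    };
}

// Step 1: data_get() replaced by direct property access, guard still in place.
function whereXWithoutDataGet(string $operator, float $value): Closure
{
    return static function (Point $item) use ($operator, $value): bool {
        $retrieved = $item->x; // was: data_get($item, $key)

        $strings = array_filter([$retrieved, $value], static function ($v) {
            return is_string($v) || (is_object($v) && method_exists($v, '__toString'));
        });

        if (count($strings) < 2 && count(array_filter([$retrieved, $value], 'is_object')) == 1) {
            return in_array($operator, ['!=', '<>', '!==']);
        }

        return compareFloats($retrieved, $operator, $value);
    };
}

// Step 2: both sides are known to be floats, so the guard goes away too.
function whereXFloatOnly(string $operator, float $value): Closure
{
    return static fn (Point $item): bool => compareFloats($item->x, $operator, $value);
}

$path = [new Point(1.0, 0.0), new Point(2.5, 0.0), new Point(4.0, 0.0)];

$step1 = array_values(array_filter($path, whereXWithoutDataGet('<', 3.0)));
$step2 = array_values(array_filter($path, whereXFloatOnly('<', 3.0)));
```

Both closures keep the same two points here. The difference is only in how much work happens per item, which is what adds up over millions of entries.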
What does that tell us?
It's totally fine to use those methods, and I will continue to use them. But after implementing and after writing tests, I will (try to remember to) refactor those instances of `->where` into `->filter` calls. The `sum` and `max` methods have the same issue: they use `data_get`, and `max` even filters the collection beforehand. We use arrays now, but I suggest extending the Collection class for your use case and implementing your logic there, as vanilla as it gets.
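As a minimal sketch of what "as vanilla as it gets" could look like: a small class around a plain array that only offers the operations the path needs, each as a single loop. Note the swap: this is a standalone class rather than a real Collection subclass, so that it runs without Laravel. The class name `PointPath` and its method names are invented for illustration.

```php
<?php

declare(strict_types=1);

// Stand-in for the Point class shown earlier.
final class Point
{
    public function __construct(
        public float $x,
        public float $y,
    ) {
    }
}

// Hypothetical array-backed path with direct property access only.
final class PointPath
{
    /** @param list<Point> $points */
    public function __construct(private array $points)
    {
    }

    // Equivalent in spirit to ->where('x', '<', $limit).
    public function leftOf(float $limit): self
    {
        $kept = [];
        foreach ($this->points as $point) {
            if ($point->x < $limit) {
                $kept[] = $point;
            }
        }

        return new self($kept);
    }

    public function sumX(): float
    {
        $sum = 0.0;
        foreach ($this->points as $point) {
            $sum += $point->x;
        }

        return $sum;
    }

    // Returns null for an empty path instead of a sentinel value.
    public function maxX(): ?float
    {
        $max = null;
        foreach ($this->points as $point) {
            if ($max === null || $point->x > $max) {
                $max = $point->x;
            }
        }

        return $max;
    }

    public function minX(): ?float
    {
        $min = null;
        foreach ($this->points as $point) {
            if ($min === null || $point->x < $min) {
                $min = $point->x;
            }
        }

        return $min;
    }

    public function count(): int
    {
        return count($this->points);
    }
}

$path = new PointPath([
    new Point(1.0, 2.0),
    new Point(2.5, 0.5),
    new Point(4.0, 3.0),
]);

$left = $path->leftOf(3.0);
```

With this, `$left->sumX()`, `$left->maxX()` and `$left->minX()` replace the `filter(...)->sum('x')` style chains, and the typed `Point` property access means no key lookup happens per item.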
I've been using Laravel for 8 years now. The application I work on has grown a lot in that time: the number of concurrent users and the amount of data those users produce and consume have both increased, along with the feature creep that naturally happens. When we initially migrated to Laravel, we were really hyped about all the syntactic sugar and helper methods. But by now we have removed almost all the Illuminate stuff from the core of our application, and we're probably going to remove more.