don't want to hit the db again

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
9 messages Options
Reply | Threaded
Open this post in threaded view
|

don't want to hit the db again

Kevin Karabian
Hi,  

Is there a way to store already retrieved related objects (related as a has_many) in a result object, such that calling the accessor for that relationship data will not hit the db again. Basically something akin to $_->{_relationship_data}{$relationship_name} = array of result objects.

Thanks,
KK

_______________________________________________
List: http://lists.scsys.co.uk/cgi-bin/mailman/listinfo/dbix-class
IRC: irc.perl.org#dbix-class
SVN: http://dev.catalyst.perl.org/repos/bast/DBIx-Class/
Searchable Archive: http://www.grokbase.com/group/dbix-class@...
Reply | Threaded
Open this post in threaded view
|

Re: don't want to hit the db again

Dagfinn Ilmari Mannsåker
Kevin Karabian <[hidden email]> writes:

> Hi,
>
> Is there a way to store already retrieved related objects (related as a
> has_many) in a result object, such that calling the accessor for that
> relationship data will not hit the db again.

This is exactly what prefetch is for.

https://metacpan.org/pod/DBIx::Class::ResultSet#prefetch
https://metacpan.org/pod/DBIx::Class::ResultSet#PREFETCHING

--
- Twitter seems more influential [than blogs] in the 'gets reported in
  the mainstream press' sense at least.               - Matt McLeod
- That'd be because the content of a tweet is easier to condense down
  to a mainstream media article.                      - Calle Dybedahl


_______________________________________________
List: http://lists.scsys.co.uk/cgi-bin/mailman/listinfo/dbix-class
IRC: irc.perl.org#dbix-class
SVN: http://dev.catalyst.perl.org/repos/bast/DBIx-Class/
Searchable Archive: http://www.grokbase.com/group/dbix-class@...
Reply | Threaded
Open this post in threaded view
|

Re: don't want to hit the db again

Kevin Karabian
Yes, but the problem is for a has_many, prefetch causes multiple rows to be returned.  For example if a result object has 10 related objects then 10 rows are returned for that one object.  If the result object is large, then that is a lot of repetition.  I want to basically run a query that will return only the 10 related objects and then put those into the result object eliminating the repetition.  

And actually my use case is that I want to do this for a large number or result objects so I am amortizing the cost by grabbing all the related objects for all the result objects in question.  So say, I have 10 objects and each has 10 related objects.  I get the 100 related objects at once and populate the result objects with the relationship data.

On Tue, Apr 28, 2015 at 3:27 PM, Dagfinn Ilmari Mannsåker <[hidden email]> wrote:
Kevin Karabian <[hidden email]> writes:

> Hi,
>
> Is there a way to store already retrieved related objects (related as a
> has_many) in a result object, such that calling the accessor for that
> relationship data will not hit the db again.

This is exactly what prefetch is for.

https://metacpan.org/pod/DBIx::Class::ResultSet#prefetch
https://metacpan.org/pod/DBIx::Class::ResultSet#PREFETCHING

--
- Twitter seems more influential [than blogs] in the 'gets reported in
  the mainstream press' sense at least.               - Matt McLeod
- That'd be because the content of a tweet is easier to condense down
  to a mainstream media article.                      - Calle Dybedahl


_______________________________________________
List: http://lists.scsys.co.uk/cgi-bin/mailman/listinfo/dbix-class
IRC: irc.perl.org#dbix-class
SVN: http://dev.catalyst.perl.org/repos/bast/DBIx-Class/
Searchable Archive: http://www.grokbase.com/group/dbix-class@...



--
Kevin Karabian
Senior Engineer
Turnitin – www.turnitin.com
kkarabian[hidden email]
510.764.7529

_______________________________________________
List: http://lists.scsys.co.uk/cgi-bin/mailman/listinfo/dbix-class
IRC: irc.perl.org#dbix-class
SVN: http://dev.catalyst.perl.org/repos/bast/DBIx-Class/
Searchable Archive: http://www.grokbase.com/group/dbix-class@...
Reply | Threaded
Open this post in threaded view
|

Re: don't want to hit the db again

Len Jaffe
Prefetch generates SQL joins. The join is the FUNDAMENTAL unit of work in a relational database. In 25 years, the only time I've ever seen a join not be faster than decomposing into multiple queries is when the queries involve millions of rows in one or more tables, and/or when the RDBMs query optimizer ran into a pathological case, or a bug.

Try an experiment.  Benchmark the query using a column-specified prefetch, and then benchmark running the first query, extracting the list of foreign keys, and runnign a second query to return all of those. I'd like to see the difference.

I predict that for 100 records, the difference in execution time will be below the threshold where it makes any sense to optimize the join away.

Len.




On Tue, Apr 28, 2015 at 6:37 PM, Kevin Karabian <[hidden email]> wrote:
Yes, but the problem is for a has_many, prefetch causes multiple rows to be returned.  For example if a result object has 10 related objects then 10 rows are returned for that one object.  If the result object is large, then that is a lot of repetition.  I want to basically run a query that will return only the 10 related objects and then put those into the result object eliminating the repetition.  

And actually my use case is that I want to do this for a large number or result objects so I am amortizing the cost by grabbing all the related objects for all the result objects in question.  So say, I have 10 objects and each has 10 related objects.  I get the 100 related objects at once and populate the result objects with the relationship data.

On Tue, Apr 28, 2015 at 3:27 PM, Dagfinn Ilmari Mannsåker <[hidden email]> wrote:
Kevin Karabian <[hidden email]> writes:

> Hi,
>
> Is there a way to store already retrieved related objects (related as a
> has_many) in a result object, such that calling the accessor for that
> relationship data will not hit the db again.

This is exactly what prefetch is for.

https://metacpan.org/pod/DBIx::Class::ResultSet#prefetch
https://metacpan.org/pod/DBIx::Class::ResultSet#PREFETCHING

--
- Twitter seems more influential [than blogs] in the 'gets reported in
  the mainstream press' sense at least.               - Matt McLeod
- That'd be because the content of a tweet is easier to condense down
  to a mainstream media article.                      - Calle Dybedahl


_______________________________________________
List: http://lists.scsys.co.uk/cgi-bin/mailman/listinfo/dbix-class
IRC: irc.perl.org#dbix-class
SVN: http://dev.catalyst.perl.org/repos/bast/DBIx-Class/
Searchable Archive: http://www.grokbase.com/group/dbix-class@...



--
Kevin Karabian
Senior Engineer
Turnitin – www.turnitin.com
kkarabian[hidden email]
<a href="tel:510.764.7529" value="+15107647529" target="_blank">510.764.7529

_______________________________________________
List: http://lists.scsys.co.uk/cgi-bin/mailman/listinfo/dbix-class
IRC: irc.perl.org#dbix-class
SVN: http://dev.catalyst.perl.org/repos/bast/DBIx-Class/
Searchable Archive: http://www.grokbase.com/group/dbix-class@...



--
Len Jaffe - Information Technology Smoke Jumper - [hidden email] 
614-404-4214    @LenJaffe  www.lenjaffe.com
Curator of Advent Planet - An Aggregation of Online Advent Calendars.


_______________________________________________
List: http://lists.scsys.co.uk/cgi-bin/mailman/listinfo/dbix-class
IRC: irc.perl.org#dbix-class
SVN: http://dev.catalyst.perl.org/repos/bast/DBIx-Class/
Searchable Archive: http://www.grokbase.com/group/dbix-class@...
Reply | Threaded
Open this post in threaded view
|

Re: don't want to hit the db again

Kevin Karabian
It is not about execution time, but data size.  The dba's don't like it when I tell them I am pulling x times more data than before.  

I am not against prefetching.  In fact, I had it as a prefetch, but, they balked about the potential increase in data returned.

On Tue, Apr 28, 2015 at 3:51 PM, Len Jaffe <[hidden email]> wrote:
Prefetch generates SQL joins. The join is the FUNDAMENTAL unit of work in a relational database. In 25 years, the only time I've ever seen a join not be faster than decomposing into multiple queries is when the queries involve millions of rows in one or more tables, and/or when the RDBMs query optimizer ran into a pathological case, or a bug.

Try an experiment.  Benchmark the query using a column-specified prefetch, and then benchmark running the first query, extracting the list of foreign keys, and runnign a second query to return all of those. I'd like to see the difference.

I predict that for 100 records, the difference in execution time will be below the threshold where it makes any sense to optimize the join away.

Len.




On Tue, Apr 28, 2015 at 6:37 PM, Kevin Karabian <[hidden email]> wrote:
Yes, but the problem is for a has_many, prefetch causes multiple rows to be returned.  For example if a result object has 10 related objects then 10 rows are returned for that one object.  If the result object is large, then that is a lot of repetition.  I want to basically run a query that will return only the 10 related objects and then put those into the result object eliminating the repetition.  

And actually my use case is that I want to do this for a large number or result objects so I am amortizing the cost by grabbing all the related objects for all the result objects in question.  So say, I have 10 objects and each has 10 related objects.  I get the 100 related objects at once and populate the result objects with the relationship data.

On Tue, Apr 28, 2015 at 3:27 PM, Dagfinn Ilmari Mannsåker <[hidden email]> wrote:
Kevin Karabian <[hidden email]> writes:

> Hi,
>
> Is there a way to store already retrieved related objects (related as a
> has_many) in a result object, such that calling the accessor for that
> relationship data will not hit the db again.

This is exactly what prefetch is for.

https://metacpan.org/pod/DBIx::Class::ResultSet#prefetch
https://metacpan.org/pod/DBIx::Class::ResultSet#PREFETCHING

--
- Twitter seems more influential [than blogs] in the 'gets reported in
  the mainstream press' sense at least.               - Matt McLeod
- That'd be because the content of a tweet is easier to condense down
  to a mainstream media article.                      - Calle Dybedahl


_______________________________________________
List: http://lists.scsys.co.uk/cgi-bin/mailman/listinfo/dbix-class
IRC: irc.perl.org#dbix-class
SVN: http://dev.catalyst.perl.org/repos/bast/DBIx-Class/
Searchable Archive: http://www.grokbase.com/group/dbix-class@...



--
Kevin Karabian
Senior Engineer
Turnitin – www.turnitin.com
kkarabian[hidden email]
<a href="tel:510.764.7529" value="+15107647529" target="_blank">510.764.7529

_______________________________________________
List: http://lists.scsys.co.uk/cgi-bin/mailman/listinfo/dbix-class
IRC: irc.perl.org#dbix-class
SVN: http://dev.catalyst.perl.org/repos/bast/DBIx-Class/
Searchable Archive: http://www.grokbase.com/group/dbix-class@...



--
Len Jaffe - Information Technology Smoke Jumper - [hidden email] 
<a href="tel:614-404-4214" value="+16144044214" target="_blank">614-404-4214    @LenJaffe  www.lenjaffe.com
Curator of Advent Planet - An Aggregation of Online Advent Calendars.


_______________________________________________
List: http://lists.scsys.co.uk/cgi-bin/mailman/listinfo/dbix-class
IRC: irc.perl.org#dbix-class
SVN: http://dev.catalyst.perl.org/repos/bast/DBIx-Class/
Searchable Archive: http://www.grokbase.com/group/dbix-class@...



--
Kevin Karabian
Senior Engineer
Turnitin – www.turnitin.com
kkarabian[hidden email]
<a href="tel:510.764.7529" value="+15107647529" target="_blank">510.764.7529

_______________________________________________
List: http://lists.scsys.co.uk/cgi-bin/mailman/listinfo/dbix-class
IRC: irc.perl.org#dbix-class
SVN: http://dev.catalyst.perl.org/repos/bast/DBIx-Class/
Searchable Archive: http://www.grokbase.com/group/dbix-class@...
Reply | Threaded
Open this post in threaded view
|

Re: don't want to hit the db again

Dmitry Latin
Try set_cache.

$row->related_resultset($relationship_name)->set_cache( [ $related1,
$related2, ... ] );

Note, that for single-object relations (belongs_to, has_one, etc) you
also should provide array in set_cache.

On 29 April 2015 at 02:16, Kevin Karabian <[hidden email]> wrote:

> It is not about execution time, but data size.  The dba's don't like it when
> I tell them I am pulling x times more data than before.
>
> I am not against prefetching.  In fact, I had it as a prefetch, but, they
> balked about the potential increase in data returned.
>
>
> On Tue, Apr 28, 2015 at 3:51 PM, Len Jaffe <[hidden email]>
> wrote:
>>
>> Prefetch generates SQL joins. The join is the FUNDAMENTAL unit of work in
>> a relational database. In 25 years, the only time I've ever seen a join not
>> be faster than decomposing into multiple queries is when the queries involve
>> millions of rows in one or more tables, and/or when the RDBMs query
>> optimizer ran into a pathological case, or a bug.
>>
>> Try an experiment.  Benchmark the query using a column-specified prefetch,
>> and then benchmark running the first query, extracting the list of foreign
>> keys, and runnign a second query to return all of those. I'd like to see the
>> difference.
>>
>> I predict that for 100 records, the difference in execution time will be
>> below the threshold where it makes any sense to optimize the join away.
>>
>> Len.
>>
>>
>>
>>
>> On Tue, Apr 28, 2015 at 6:37 PM, Kevin Karabian <[hidden email]>
>> wrote:
>>>
>>> Yes, but the problem is for a has_many, prefetch causes multiple rows to
>>> be returned.  For example if a result object has 10 related objects then 10
>>> rows are returned for that one object.  If the result object is large, then
>>> that is a lot of repetition.  I want to basically run a query that will
>>> return only the 10 related objects and then put those into the result object
>>> eliminating the repetition.
>>>
>>> And actually my use case is that I want to do this for a large number or
>>> result objects so I am amortizing the cost by grabbing all the related
>>> objects for all the result objects in question.  So say, I have 10 objects
>>> and each has 10 related objects.  I get the 100 related objects at once and
>>> populate the result objects with the relationship data.
>>>
>>> On Tue, Apr 28, 2015 at 3:27 PM, Dagfinn Ilmari Mannsåker
>>> <[hidden email]> wrote:
>>>>
>>>> Kevin Karabian <[hidden email]> writes:
>>>>
>>>> > Hi,
>>>> >
>>>> > Is there a way to store already retrieved related objects (related as
>>>> > a
>>>> > has_many) in a result object, such that calling the accessor for that
>>>> > relationship data will not hit the db again.
>>>>
>>>> This is exactly what prefetch is for.
>>>>
>>>> https://metacpan.org/pod/DBIx::Class::ResultSet#prefetch
>>>> https://metacpan.org/pod/DBIx::Class::ResultSet#PREFETCHING
>>>>
>>>> --
>>>> - Twitter seems more influential [than blogs] in the 'gets reported in
>>>>   the mainstream press' sense at least.               - Matt McLeod
>>>> - That'd be because the content of a tweet is easier to condense down
>>>>   to a mainstream media article.                      - Calle Dybedahl
>>>>
>>>>
>>>> _______________________________________________
>>>> List: http://lists.scsys.co.uk/cgi-bin/mailman/listinfo/dbix-class
>>>> IRC: irc.perl.org#dbix-class
>>>> SVN: http://dev.catalyst.perl.org/repos/bast/DBIx-Class/
>>>> Searchable Archive:
>>>> http://www.grokbase.com/group/dbix-class@...
>>>
>>>
>>>
>>>
>>> --
>>> Kevin Karabian
>>> Senior Engineer
>>> Turnitin – www.turnitin.com
>>> [hidden email]
>>> 510.764.7529
>>>
>>> _______________________________________________
>>> List: http://lists.scsys.co.uk/cgi-bin/mailman/listinfo/dbix-class
>>> IRC: irc.perl.org#dbix-class
>>> SVN: http://dev.catalyst.perl.org/repos/bast/DBIx-Class/
>>> Searchable Archive:
>>> http://www.grokbase.com/group/dbix-class@...
>>
>>
>>
>>
>> --
>> Len Jaffe - Information Technology Smoke Jumper -
>> [hidden email]
>> 614-404-4214    @LenJaffe  www.lenjaffe.com
>> Host of Columbus Code Jam  - @CodeJamCMH
>> Curator of Advent Planet - An Aggregation of Online Advent Calendars.
>>
>>
>> _______________________________________________
>> List: http://lists.scsys.co.uk/cgi-bin/mailman/listinfo/dbix-class
>> IRC: irc.perl.org#dbix-class
>> SVN: http://dev.catalyst.perl.org/repos/bast/DBIx-Class/
>> Searchable Archive:
>> http://www.grokbase.com/group/dbix-class@...
>
>
>
>
> --
> Kevin Karabian
> Senior Engineer
> Turnitin – www.turnitin.com
> [hidden email]
> 510.764.7529
>
> _______________________________________________
> List: http://lists.scsys.co.uk/cgi-bin/mailman/listinfo/dbix-class
> IRC: irc.perl.org#dbix-class
> SVN: http://dev.catalyst.perl.org/repos/bast/DBIx-Class/
> Searchable Archive:
> http://www.grokbase.com/group/dbix-class@...



--
//wbr, Dmitry L.

_______________________________________________
List: http://lists.scsys.co.uk/cgi-bin/mailman/listinfo/dbix-class
IRC: irc.perl.org#dbix-class
SVN: http://dev.catalyst.perl.org/repos/bast/DBIx-Class/
Searchable Archive: http://www.grokbase.com/group/dbix-class@...
Reply | Threaded
Open this post in threaded view
|

Re: don't want to hit the db again

Len Jaffe
In reply to this post by Kevin Karabian
The dba's don't like it when I tell them I am pulling x times more data than before.
DBAs: This is why we can't have nice things.

So then just do two queries, perform in memory look-ups from one list of records into the other.



On Tue, Apr 28, 2015 at 7:16 PM, Kevin Karabian <[hidden email]> wrote:
It is not about execution time, but data size.  The dba's don't like it when I tell them I am pulling x times more data than before.  

I am not against prefetching.  In fact, I had it as a prefetch, but, they balked about the potential increase in data returned.


On Tue, Apr 28, 2015 at 3:51 PM, Len Jaffe <[hidden email]> wrote:
Prefetch generates SQL joins. The join is the FUNDAMENTAL unit of work in a relational database. In 25 years, the only time I've ever seen a join not be faster than decomposing into multiple queries is when the queries involve millions of rows in one or more tables, and/or when the RDBMs query optimizer ran into a pathological case, or a bug.

Try an experiment.  Benchmark the query using a column-specified prefetch, and then benchmark running the first query, extracting the list of foreign keys, and runnign a second query to return all of those. I'd like to see the difference.

I predict that for 100 records, the difference in execution time will be below the threshold where it makes any sense to optimize the join away.

Len.




On Tue, Apr 28, 2015 at 6:37 PM, Kevin Karabian <[hidden email]> wrote:
Yes, but the problem is for a has_many, prefetch causes multiple rows to be returned.  For example if a result object has 10 related objects then 10 rows are returned for that one object.  If the result object is large, then that is a lot of repetition.  I want to basically run a query that will return only the 10 related objects and then put those into the result object eliminating the repetition.  

And actually my use case is that I want to do this for a large number or result objects so I am amortizing the cost by grabbing all the related objects for all the result objects in question.  So say, I have 10 objects and each has 10 related objects.  I get the 100 related objects at once and populate the result objects with the relationship data.

On Tue, Apr 28, 2015 at 3:27 PM, Dagfinn Ilmari Mannsåker <[hidden email]> wrote:
Kevin Karabian <[hidden email]> writes:

> Hi,
>
> Is there a way to store already retrieved related objects (related as a
> has_many) in a result object, such that calling the accessor for that
> relationship data will not hit the db again.

This is exactly what prefetch is for.

https://metacpan.org/pod/DBIx::Class::ResultSet#prefetch
https://metacpan.org/pod/DBIx::Class::ResultSet#PREFETCHING

--
- Twitter seems more influential [than blogs] in the 'gets reported in
  the mainstream press' sense at least.               - Matt McLeod
- That'd be because the content of a tweet is easier to condense down
  to a mainstream media article.                      - Calle Dybedahl


_______________________________________________
List: http://lists.scsys.co.uk/cgi-bin/mailman/listinfo/dbix-class
IRC: irc.perl.org#dbix-class
SVN: http://dev.catalyst.perl.org/repos/bast/DBIx-Class/
Searchable Archive: http://www.grokbase.com/group/dbix-class@...



--
Kevin Karabian
Senior Engineer
Turnitin – www.turnitin.com
kkarabian[hidden email]
<a href="tel:510.764.7529" value="+15107647529" target="_blank">510.764.7529

_______________________________________________
List: http://lists.scsys.co.uk/cgi-bin/mailman/listinfo/dbix-class
IRC: irc.perl.org#dbix-class
SVN: http://dev.catalyst.perl.org/repos/bast/DBIx-Class/
Searchable Archive: http://www.grokbase.com/group/dbix-class@...



--
Len Jaffe - Information Technology Smoke Jumper - [hidden email] 
<a href="tel:614-404-4214" value="+16144044214" target="_blank">614-404-4214    @LenJaffe  www.lenjaffe.com
Curator of Advent Planet - An Aggregation of Online Advent Calendars.


_______________________________________________
List: http://lists.scsys.co.uk/cgi-bin/mailman/listinfo/dbix-class
IRC: irc.perl.org#dbix-class
SVN: http://dev.catalyst.perl.org/repos/bast/DBIx-Class/
Searchable Archive: http://www.grokbase.com/group/dbix-class@...



--
Kevin Karabian
Senior Engineer
Turnitin – www.turnitin.com
kkarabian[hidden email]
<a href="tel:510.764.7529" value="+15107647529" target="_blank">510.764.7529

_______________________________________________
List: http://lists.scsys.co.uk/cgi-bin/mailman/listinfo/dbix-class
IRC: irc.perl.org#dbix-class
SVN: http://dev.catalyst.perl.org/repos/bast/DBIx-Class/
Searchable Archive: http://www.grokbase.com/group/dbix-class@...



--
Len Jaffe - Information Technology Smoke Jumper - [hidden email] 
614-404-4214    @LenJaffe  www.lenjaffe.com
Curator of Advent Planet - An Aggregation of Online Advent Calendars.


_______________________________________________
List: http://lists.scsys.co.uk/cgi-bin/mailman/listinfo/dbix-class
IRC: irc.perl.org#dbix-class
SVN: http://dev.catalyst.perl.org/repos/bast/DBIx-Class/
Searchable Archive: http://www.grokbase.com/group/dbix-class@...
Reply | Threaded
Open this post in threaded view
|

Re: don't want to hit the db again

Kevin Karabian
In reply to this post by Dmitry Latin
Thanks so much!  This will come in very handy in many many places ahead.


On Tue, Apr 28, 2015 at 4:28 PM, Dmitry L. <[hidden email]> wrote:
Try set_cache.

$row->related_resultset($relationship_name)->set_cache( [ $related1,
$related2, ... ] );

Note, that for single-object relations (belongs_to, has_one, etc) you
also should provide array in set_cache.

On 29 April 2015 at 02:16, Kevin Karabian <[hidden email]> wrote:
> It is not about execution time, but data size.  The dba's don't like it when
> I tell them I am pulling x times more data than before.
>
> I am not against prefetching.  In fact, I had it as a prefetch, but, they
> balked about the potential increase in data returned.
>
>
> On Tue, Apr 28, 2015 at 3:51 PM, Len Jaffe <[hidden email]>
> wrote:
>>
>> Prefetch generates SQL joins. The join is the FUNDAMENTAL unit of work in
>> a relational database. In 25 years, the only time I've ever seen a join not
>> be faster than decomposing into multiple queries is when the queries involve
>> millions of rows in one or more tables, and/or when the RDBMs query
>> optimizer ran into a pathological case, or a bug.
>>
>> Try an experiment.  Benchmark the query using a column-specified prefetch,
>> and then benchmark running the first query, extracting the list of foreign
>> keys, and runnign a second query to return all of those. I'd like to see the
>> difference.
>>
>> I predict that for 100 records, the difference in execution time will be
>> below the threshold where it makes any sense to optimize the join away.
>>
>> Len.
>>
>>
>>
>>
>> On Tue, Apr 28, 2015 at 6:37 PM, Kevin Karabian <[hidden email]>
>> wrote:
>>>
>>> Yes, but the problem is for a has_many, prefetch causes multiple rows to
>>> be returned.  For example if a result object has 10 related objects then 10
>>> rows are returned for that one object.  If the result object is large, then
>>> that is a lot of repetition.  I want to basically run a query that will
>>> return only the 10 related objects and then put those into the result object
>>> eliminating the repetition.
>>>
>>> And actually my use case is that I want to do this for a large number or
>>> result objects so I am amortizing the cost by grabbing all the related
>>> objects for all the result objects in question.  So say, I have 10 objects
>>> and each has 10 related objects.  I get the 100 related objects at once and
>>> populate the result objects with the relationship data.
>>>
>>> On Tue, Apr 28, 2015 at 3:27 PM, Dagfinn Ilmari Mannsåker
>>> <[hidden email]> wrote:
>>>>
>>>> Kevin Karabian <[hidden email]> writes:
>>>>
>>>> > Hi,
>>>> >
>>>> > Is there a way to store already retrieved related objects (related as
>>>> > a
>>>> > has_many) in a result object, such that calling the accessor for that
>>>> > relationship data will not hit the db again.
>>>>
>>>> This is exactly what prefetch is for.
>>>>
>>>> https://metacpan.org/pod/DBIx::Class::ResultSet#prefetch
>>>> https://metacpan.org/pod/DBIx::Class::ResultSet#PREFETCHING
>>>>
>>>> --
>>>> - Twitter seems more influential [than blogs] in the 'gets reported in
>>>>   the mainstream press' sense at least.               - Matt McLeod
>>>> - That'd be because the content of a tweet is easier to condense down
>>>>   to a mainstream media article.                      - Calle Dybedahl
>>>>
>>>>
>>>> _______________________________________________
>>>> List: http://lists.scsys.co.uk/cgi-bin/mailman/listinfo/dbix-class
>>>> IRC: irc.perl.org#dbix-class
>>>> SVN: http://dev.catalyst.perl.org/repos/bast/DBIx-Class/
>>>> Searchable Archive:
>>>> http://www.grokbase.com/group/dbix-class@...
>>>
>>>
>>>
>>>
>>> --
>>> Kevin Karabian
>>> Senior Engineer
>>> Turnitin – www.turnitin.com
>>> [hidden email]
>>> <a href="tel:510.764.7529" value="+15107647529">510.764.7529
>>>
>>> _______________________________________________
>>> List: http://lists.scsys.co.uk/cgi-bin/mailman/listinfo/dbix-class
>>> IRC: irc.perl.org#dbix-class
>>> SVN: http://dev.catalyst.perl.org/repos/bast/DBIx-Class/
>>> Searchable Archive:
>>> http://www.grokbase.com/group/dbix-class@...
>>
>>
>>
>>
>> --
>> Len Jaffe - Information Technology Smoke Jumper -
>> [hidden email]
>> <a href="tel:614-404-4214" value="+16144044214">614-404-4214    @LenJaffe  www.lenjaffe.com
>> Host of Columbus Code Jam  - @CodeJamCMH
>> Curator of Advent Planet - An Aggregation of Online Advent Calendars.
>>
>>
>> _______________________________________________
>> List: http://lists.scsys.co.uk/cgi-bin/mailman/listinfo/dbix-class
>> IRC: irc.perl.org#dbix-class
>> SVN: http://dev.catalyst.perl.org/repos/bast/DBIx-Class/
>> Searchable Archive:
>> http://www.grokbase.com/group/dbix-class@...
>
>
>
>
> --
> Kevin Karabian
> Senior Engineer
> Turnitin – www.turnitin.com
> [hidden email]
> <a href="tel:510.764.7529" value="+15107647529">510.764.7529
>
> _______________________________________________
> List: http://lists.scsys.co.uk/cgi-bin/mailman/listinfo/dbix-class
> IRC: irc.perl.org#dbix-class
> SVN: http://dev.catalyst.perl.org/repos/bast/DBIx-Class/
> Searchable Archive:
> http://www.grokbase.com/group/dbix-class@...



--
//wbr, Dmitry L.



--
Kevin Karabian
Senior Engineer
Turnitin – www.turnitin.com
kkarabian[hidden email]
510.764.7529

_______________________________________________
List: http://lists.scsys.co.uk/cgi-bin/mailman/listinfo/dbix-class
IRC: irc.perl.org#dbix-class
SVN: http://dev.catalyst.perl.org/repos/bast/DBIx-Class/
Searchable Archive: http://www.grokbase.com/group/dbix-class@...
Reply | Threaded
Open this post in threaded view
|

Re: don't want to hit the db again

David Cantrell
In reply to this post by Kevin Karabian
On Tue, Apr 28, 2015 at 03:37:42PM -0700, Kevin Karabian wrote:

> Yes, but the problem is for a has_many, prefetch causes multiple rows to be
> returned.  For example if a result object has 10 related objects then 10
> rows are returned for that one object.  If the result object is large, then
> that is a lot of repetition.

We've come across a similar problem at work too. We had a situation
where there were a few very large rows which each had very many small
related rows. We were carefully using prefetch to make just 1 query:

  -- meadows are big but few, cows are small but many
  SELECT meadows.*, cows.*
    FROM meadows, cows
   WHERE cows.meadow_id = meadows.id;

but that ended up returning so much data (almost all of it repeated)
that reading it from the database, transferring it across the network,
and then unpacking it into perl structures took far too long. In the end
we made the code faster by not pre-fetching and letting DBIx::Class
follow the relationship (and issue a query) whenever we wanted
related rows:

  SELECT * FROM meadows;
  foreach (@meadows) {
    SELECT * FROM cows WHERE meadow_id = $_->id;
  }

That makes $#meadows + 2 queries instead of 1, but returns so much less
data that it's quicker anyway. We could make the code faster and have
fewer queries (although not just one) if we could do something like
this:

  # few rows, but each row is really really big
  SELECT * FROM meadows
  # each row is small but there are a lot of them
  @cows    = SELECT * FROM cows WHERE meadow_id in (map { $_->id } @meadows)
  iterate over @cows attaching them to appropriate meadow records

That returns exactly the same amount of data as the second version, but
in two queries. It's O(1) instead of O(N). There will be circumstances
where this is faster.

--
David Cantrell | Hero of the Information Age

     I'm in retox

_______________________________________________
List: http://lists.scsys.co.uk/cgi-bin/mailman/listinfo/dbix-class
IRC: irc.perl.org#dbix-class
SVN: http://dev.catalyst.perl.org/repos/bast/DBIx-Class/
Searchable Archive: http://www.grokbase.com/group/dbix-class@...