Database too big.

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
5 messages Options
Reply | Threaded
Open this post in threaded view
|

Database too big.

Bill Moseley
I'm looking for some ideas here:

I'm working with a large Postgresql database.  It's getting too large and busy. There is an existing DBIC schema for this database that is used by a number of applications and tools.  

There is a subset of tables in this database that are mostly isolated from the other set of tables (i.e. joins are rare).  So, one thought here is to pull this subset of tables out into a separate database.

But, then how to minimize the changes needed to all existing code that uses DBIC for accessing this database?

One option is to hunt down every use of $schema for the subset of tables and replace it with $other_schema. 

But, is there anything I might do that is a bit more "clever"?:

$schema->resultset( 'Foo' )  # uses one database
$schema->resultset( 'Bar' )  # uses a different connection

so the existing application code would not requires so many changes?  (And, no, I'm not expecting joins to work across those.)

That is, perhaps have two $storage objects and swap them out based on the result source.

I'm sure it's not that simple -- for example thawing serialized DBIC objects would need to have the correct storage associated, too.

Any other ideas?



--
Bill Moseley
[hidden email]

_______________________________________________
List: http://lists.scsys.co.uk/cgi-bin/mailman/listinfo/dbix-class
IRC: irc.perl.org#dbix-class
SVN: http://dev.catalyst.perl.org/repos/bast/DBIx-Class/
Searchable Archive: http://www.grokbase.com/group/dbix-class@...
Reply | Threaded
Open this post in threaded view
|

Re: Database too big.

Torsten Raudssus
You can just wrap resultset which then dispatches to a second schema
object on this specific resultset-request, if your code really splits
up that 2 so hard, then it might work.

On Thu, Apr 3, 2014 at 2:47 PM, Bill Moseley <[hidden email]> wrote:

> I'm looking for some ideas here:
>
> I'm working with a large Postgresql database.  It's getting too large and
> busy. There is an existing DBIC schema for this database that is used by a
> number of applications and tools.
>
> There is a subset of tables in this database that are mostly isolated from
> the other set of tables (i.e. joins are rare).  So, one thought here is to
> pull this subset of tables out into a separate database.
>
> But, then how to minimize the changes needed to all existing code that uses
> DBIC for accessing this database?
>
> One option is to hunt down every use of $schema for the subset of tables and
> replace it with $other_schema.
>
> But, is there anything I might do that is a bit more "clever"?:
>
> $schema->resultset( 'Foo' )  # uses one database
> $schema->resultset( 'Bar' )  # uses a different connection
>
> so the existing application code would not requires so many changes?  (And,
> no, I'm not expecting joins to work across those.)
>
> That is, perhaps have two $storage objects and swap them out based on the
> result source.
>
> I'm sure it's not that simple -- for example thawing serialized DBIC objects
> would need to have the correct storage associated, too.
>
> Any other ideas?
>
>
>
> --
> Bill Moseley
> [hidden email]
>
> _______________________________________________
> List: http://lists.scsys.co.uk/cgi-bin/mailman/listinfo/dbix-class
> IRC: irc.perl.org#dbix-class
> SVN: http://dev.catalyst.perl.org/repos/bast/DBIx-Class/
> Searchable Archive:
> http://www.grokbase.com/group/dbix-class@...

_______________________________________________
List: http://lists.scsys.co.uk/cgi-bin/mailman/listinfo/dbix-class
IRC: irc.perl.org#dbix-class
SVN: http://dev.catalyst.perl.org/repos/bast/DBIx-Class/
Searchable Archive: http://www.grokbase.com/group/dbix-class@...
Reply | Threaded
Open this post in threaded view
|

Re: Database too big.

Peter Rabbitson-2
In reply to this post by Bill Moseley
On Thu, Apr 03, 2014 at 05:47:18AM -0700, Bill Moseley wrote:
>
> One option is to hunt down every use of $schema for the subset of tables
> and replace it with $other_schema.

This is what I would recommend. Can you give ideas of the scale of
changes required?
 
> But, is there anything I might do that is a bit more "clever"?:
>
> $schema->resultset( 'Foo' )  # uses one database
> $schema->resultset( 'Bar' )  # uses a different connection

A schema object is by definition a "proxy that ties mutliple sources and
a *single* storage object together". As such cleverness in this area is
likely to bite you. I can not think of specifics at this moment, but the
gut feeling is "don't do it".

Cheers


_______________________________________________
List: http://lists.scsys.co.uk/cgi-bin/mailman/listinfo/dbix-class
IRC: irc.perl.org#dbix-class
SVN: http://dev.catalyst.perl.org/repos/bast/DBIx-Class/
Searchable Archive: http://www.grokbase.com/group/dbix-class@...
Reply | Threaded
Open this post in threaded view
|

Re: Database too big.

Bill Moseley



On Thu, Apr 3, 2014 at 7:07 AM, Peter Rabbitson <[hidden email]> wrote:
On Thu, Apr 03, 2014 at 05:47:18AM -0700, Bill Moseley wrote:
>
> One option is to hunt down every use of $schema for the subset of tables
> and replace it with $other_schema.

This is what I would recommend. Can you give ideas of the scale of
changes required?

Hard to quantify.  It's a few big apps and command-line utilities.   It's just normal work to hunt down every possible usage.   Making lots of changes has its risks, too.

But, not everything is as simple as literal $schema->resultset( ... ) lines everywhere, of course. There's code that says "Oh, this things works with "Foo", or this is "Foo" controller so I'll use $schema->resltset( 'Foo' )". That is, much of the high-level code assumes a single $schema.

DBIC is the common later, which would make it handy to solve there.

> But, is there anything I might do that is a bit more "clever"?:
>
> $schema->resultset( 'Foo' )  # uses one database
> $schema->resultset( 'Bar' )  # uses a different connection

A schema object is by definition a "proxy that ties mutliple sources and
a *single* storage object together". As such cleverness in this area is
likely to bite you. I can not think of specifics at this moment, but the
gut feeling is "don't do it".

Well, gut felling is a valid response.

Perhaps a different animal, but the replication code manages multiple database connections which made me think about doing something similar at the $storage layer.

Thanks,


--
Bill Moseley
[hidden email]

_______________________________________________
List: http://lists.scsys.co.uk/cgi-bin/mailman/listinfo/dbix-class
IRC: irc.perl.org#dbix-class
SVN: http://dev.catalyst.perl.org/repos/bast/DBIx-Class/
Searchable Archive: http://www.grokbase.com/group/dbix-class@...
Reply | Threaded
Open this post in threaded view
|

Re: Database too big.

Lasse Makholm
I have little experience with postgres myself, but it seem to that if you just want to split the database but keep one DBIx::Class schema, you should be able to do that using the postgres' schema search path which allows you to refer to tables in schemas other than the current without qualifying the tables names.

/L


On Thu, Apr 3, 2014 at 5:19 PM, Bill Moseley <[hidden email]> wrote:



On Thu, Apr 3, 2014 at 7:07 AM, Peter Rabbitson <[hidden email]> wrote:
On Thu, Apr 03, 2014 at 05:47:18AM -0700, Bill Moseley wrote:
>
> One option is to hunt down every use of $schema for the subset of tables
> and replace it with $other_schema.

This is what I would recommend. Can you give ideas of the scale of
changes required?

Hard to quantify.  It's a few big apps and command-line utilities.   It's just normal work to hunt down every possible usage.   Making lots of changes has its risks, too.

But, not everything is as simple as literal $schema->resultset( ... ) lines everywhere, of course. There's code that says "Oh, this things works with "Foo", or this is "Foo" controller so I'll use $schema->resltset( 'Foo' )". That is, much of the high-level code assumes a single $schema.

DBIC is the common later, which would make it handy to solve there.

> But, is there anything I might do that is a bit more "clever"?:
>
> $schema->resultset( 'Foo' )  # uses one database
> $schema->resultset( 'Bar' )  # uses a different connection

A schema object is by definition a "proxy that ties mutliple sources and
a *single* storage object together". As such cleverness in this area is
likely to bite you. I can not think of specifics at this moment, but the
gut feeling is "don't do it".

Well, gut felling is a valid response.

Perhaps a different animal, but the replication code manages multiple database connections which made me think about doing something similar at the $storage layer.

Thanks,


--
Bill Moseley
[hidden email]

_______________________________________________
List: http://lists.scsys.co.uk/cgi-bin/mailman/listinfo/dbix-class
IRC: irc.perl.org#dbix-class
SVN: http://dev.catalyst.perl.org/repos/bast/DBIx-Class/
Searchable Archive: http://www.grokbase.com/group/dbix-class@...


_______________________________________________
List: http://lists.scsys.co.uk/cgi-bin/mailman/listinfo/dbix-class
IRC: irc.perl.org#dbix-class
SVN: http://dev.catalyst.perl.org/repos/bast/DBIx-Class/
Searchable Archive: http://www.grokbase.com/group/dbix-class@...