From: Robert Haas on 2 Aug 2010 06:58

On Mon, Aug 2, 2010 at 3:20 AM, Hardik Belani <hardikbelani(a)gmail.com> wrote:
> We are using postgres as the RDBMS for our product. A requirement is
> coming up for a feature that will require me to store data about various
> data points (mainly numbers) on a time scale. A measurement is taken
> every few seconds/minutes and needs to be stored for statistical
> analysis, so numbers (integers/floats) must be stored every minute.
>
> For this I can create a table with number and time (maybe a time offset
> instead of a timestamp) as columns. But it will still require me to store
> a huge number of rows, on the order of a few million. Data is read-only
> and only inserts can happen, but I need to perform all kinds of
> aggregation to get various statistics: for example, daily averages,
> monthly averages, etc.
>
> We are already using postgres for our product, so using postgres does not
> add any additional installation requirement and hence is a bit easier.
>
> Would you recommend postgres for this kind of requirement, and will it
> provide the performance? Or do you recommend any other database meant
> for such requirements? I am also searching for a good historian database
> if postgres doesn't support this.

You can certainly use Postgres in this kind of environment, and I have. Of course, if the volume of data is higher than your hardware can keep up with, then you're going to have problems. Disabling synchronous_commit may help, if losing the last few transactions is acceptable in the event of a system crash; appropriate use of table partitioning may help, too. There are certainly databases out there that are better optimized for this case (consider the rrd stuff built into mrtg, for example), but they're also not as feature-rich, so it's a trade-off.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
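The combination Robert suggests (a narrow time-series table, range partitioning, and relaxed durability for the insert stream) can be sketched as follows. This is a minimal illustration, not anything from the thread: the table and column names are invented, and the PARTITION BY syntax shown is the declarative form added in PostgreSQL 10; in the 9.x era of this thread, the same layout was built with table inheritance and insert triggers.

```sql
-- Illustrative schema: one row per measurement, range-partitioned by time.
CREATE TABLE measurement (
    point_id integer          NOT NULL,
    taken_at timestamptz      NOT NULL,
    value    double precision NOT NULL
) PARTITION BY RANGE (taken_at);

-- One partition per month keeps indexes small and lets old data be
-- removed cheaply with DROP TABLE instead of a bulk DELETE.
CREATE TABLE measurement_2010_08 PARTITION OF measurement
    FOR VALUES FROM ('2010-08-01') TO ('2010-09-01');

-- Relax durability for the insert session: a crash may lose the last
-- few commits, but insert throughput improves markedly.
SET synchronous_commit = off;

-- The daily/monthly averages the original poster asks about fall out of
-- date_trunc() plus GROUP BY; change 'day' to 'month' for monthly stats.
SELECT date_trunc('day', taken_at) AS day, avg(value) AS avg_value
FROM measurement
GROUP BY 1
ORDER BY 1;
```

With monthly partitions, a query constrained to a time range only scans the partitions that overlap it, so the aggregate queries stay fast even as the total row count grows into the millions.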
From: Etienne Dube on 2 Aug 2010 09:25

On 02/08/2010 3:20 AM, Hardik Belani wrote:
> We are using postgres as the RDBMS for our product. A requirement is
> coming up for a feature that will require me to store data about various
> data points (mainly numbers) on a time scale. A measurement is taken
> every few seconds/minutes and needs to be stored for statistical
> analysis, so numbers (integers/floats) must be stored every minute.
> For this I can create a table with number and time (maybe a time offset
> instead of a timestamp) as columns. But it will still require me to store
> a huge number of rows, on the order of a few million. Data is read-only
> and only inserts can happen, but I need to perform all kinds of
> aggregation to get various statistics: for example, daily averages,
> monthly averages, etc.
> We are already using postgres for our product, so using postgres does not
> add any additional installation requirement and hence is a bit easier.
> Would you recommend postgres for this kind of requirement, and will it
> provide the performance? Or do you recommend any other database meant
> for such requirements? I am also searching for a good historian database
> if postgres doesn't support this.
> Thanks,
> Hardik

Hi Hardik,

Data warehousing techniques could help with your requirement of aggregating large amounts of data. Have a look at "The Data Warehouse Toolkit" by R. Kimball for how to design a star schema with aggregate tables (these can be implemented as materialized views using PL/pgSQL and triggers under postgres). You could also use an OLAP server (e.g. Mondrian, which is pretty nice and open source as well) on top of your postgres DB, as it can use aggregate tables transparently when needed.

Etienne
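Etienne's "materialized views using PL/pgSQL and triggers" idea can be emulated as below: a trigger rolls each insert into a small daily summary table, so aggregate queries never touch the raw data. A sketch with invented table and function names, using the era-appropriate EXECUTE PROCEDURE syntax (PostgreSQL gained built-in CREATE MATERIALIZED VIEW only in 9.3):

```sql
-- Hypothetical daily aggregate table, maintained incrementally.
CREATE TABLE daily_stats (
    day   date PRIMARY KEY,
    n     bigint           NOT NULL,
    total double precision NOT NULL
);

CREATE OR REPLACE FUNCTION roll_up_measurement() RETURNS trigger AS $$
BEGIN
    -- Try to update the existing row for the day; insert one if absent.
    UPDATE daily_stats
       SET n = n + 1, total = total + NEW.value
     WHERE day = NEW.taken_at::date;
    IF NOT FOUND THEN
        INSERT INTO daily_stats (day, n, total)
        VALUES (NEW.taken_at::date, 1, NEW.value);
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER measurement_roll_up
AFTER INSERT ON measurement
FOR EACH ROW EXECUTE PROCEDURE roll_up_measurement();

-- Daily averages now come from the small summary table:
SELECT day, total / n AS avg_value FROM daily_stats ORDER BY day;
```

Note the UPDATE-then-INSERT pattern can race when two sessions first insert rows for the same new day; on PostgreSQL 9.5 and later, INSERT ... ON CONFLICT (day) DO UPDATE closes that window.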
From: Greg Smith on 3 Aug 2010 20:10

Hardik Belani wrote:
> For this I can create a table with number and time (maybe a time offset
> instead of a timestamp) as columns. But it will still require me to store
> a huge number of rows, on the order of a few million. Data is read-only
> and only inserts can happen, but I need to perform all kinds of
> aggregation to get various statistics: for example, daily averages,
> monthly averages, etc.

You've unfortunately asked on the wrong list about this. pgsql-hackers is intended mainly for discussion related to the source code of PostgreSQL, so this is off-topic for it. The people who like to argue about the best way to implement aggregates and the like are on the pgsql-performance list. You'd be more likely to get detailed responses if you asked this question there. That group loves to talk about how to design things for other people.

--
Greg Smith   2ndQuadrant US   Baltimore, MD
PostgreSQL Training, Services and Support
greg(a)2ndQuadrant.com   www.2ndQuadrant.us
From: Hardik Belani on 4 Aug 2010 02:57

Thanks for all your responses, and my apologies for putting the question in the wrong list. I think OLAP is the answer for my requirements.

Regards,
Hardik

On Wed, Aug 4, 2010 at 5:40 AM, Greg Smith <greg(a)2ndquadrant.com> wrote:
> Hardik Belani wrote:
>> For this I can create a table with number and time (maybe a time offset
>> instead of a timestamp) as columns. But it will still require me to
>> store a huge number of rows, on the order of a few million. Data is
>> read-only and only inserts can happen, but I need to perform all kinds
>> of aggregation to get various statistics: for example, daily averages,
>> monthly averages, etc.
>
> You've unfortunately asked on the wrong list about this. pgsql-hackers
> is intended mainly for discussion related to the source code of
> PostgreSQL, so this is off-topic for it. The people who like to argue
> about the best way to implement aggregates and the like are on the
> pgsql-performance list. You'd be more likely to get detailed responses
> if you asked this question there. That group loves to talk about how to
> design things for other people.
>
> --
> Greg Smith   2ndQuadrant US   Baltimore, MD
> PostgreSQL Training, Services and Support
> greg(a)2ndQuadrant.com   www.2ndQuadrant.us