Sunday, October 16th, 2011

If you ever stumble over the error message in the title of this blog entry:

OA> connect server 1
Invalid MP IP Address: 0.0.0.0
Operation failed.

And your EBIPA settings look like this:

OA> show ebipa
EBIPA Device Server Settings
Bay Enabled EBIPA/Current   Netmask         Gateway         DNS             Domain
--- ------- --------------- --------------- --------------- --------------- ------
  1   Yes   192.168.101.31  255.255.255.0   0.0.0.0         192.168.101.1
            0.0.0.0                                        
...

Please don't try to reset the EBIPA settings, don't try to fail-over or restart the OA, don't try to reset the specific iLO. Good ideas, but none of these actions will give you the satisfying success you're looking for.

All you need to do is to power cycle the whole blade center. Easy, isn't it?

Wednesday, February 3rd, 2010


Yesterday Facebook announced the upcoming release of HipHop for PHP, a PHP to C++ compiler to speed up your PHP application. It's not a new idea to compile a scripting language to a compiled language, but as far as I know, in the case of PHP it's the first time anyone has ever tried this.

The source code is not yet available, so it's hard to tell if this approach is feasible in real-world scenarios. At least Facebook asserts that they use it to serve about 90% of their web traffic.

I assume HipHop will be quite helpful if you're hosting a large and mature PHP application. But please, before you jump to the conclusion that all your scaling problems are now solved: they aren't. Scaling is not only about speeding up the execution of your PHP code, it's also about I/O, network and databases. So, HipHop is one single step, but it's definitely not the Holy Grail. Everyone with some years' experience knows that there is no Holy Grail. At least not in IT.

Some people already complain about the lack of eval() in HipHop, which is obviously impossible to implement in a compiled language. But on the other hand, the same people argue against the use of eval() at all, because it's a potential security risk and also tends to support bad programming style. I don't know, I think I can live easily without eval().

Very curiously looking forward to the public release of HipHop.

Wednesday, January 20th, 2010

A couple of days ago, I stumbled over an installation in which CGI was used to run a Python-based web application. Of course the application ran terribly slowly, and as I mentioned earlier in »Save energy! Stop using CGI!«, it's (nowadays) always a bad idea to use CGI. Not only is it tediously slow and bad software design, it's also soooo 90's.

What's the difference between CGI and FastCGI?

Let me use a metaphor to start. Imagine a well...

python_cgi.jpg python_fastcgi.jpg
Here you see the old-fashioned way of CGI: For every request you have to lower the bucket all the way down into the well (fork a new process), let water enter the bucket (initialize and execute your application), pull the bucket up to the surface and empty it (send the data to the web server and free all allocated memory). And here is the modern FastCGI way: Install the faucet (start the FastCGI process) and every time you need water, turn it on (connect and send a request), get water (calculate and get the answer), and turn it off (close the connection). No need to fork, initialize your application, and free the allocated memory on every single request.
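To make the metaphor a bit more tangible, here is a minimal Python 3 sketch. It is not a real CGI or FastCGI setup: it only contrasts starting a fresh interpreter for every "request" (the bucket) with calling a function in a process that is already running (the faucet). The request count and the function names are made up for this illustration.

```python
import subprocess
import sys
import time

REQUESTS = 5  # hypothetical request count, kept small on purpose


def handle_request():
    """The 'application' itself; it is the same in both models."""
    return "Hello World!\n"


def cgi_style():
    """Bucket: start a new interpreter process for every single request."""
    code = "import sys; sys.stdout.write('Hello World!\\n')"
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def timed(func, count):
    """Call func count times; return the outputs and the elapsed seconds."""
    start = time.perf_counter()
    outputs = [func() for _ in range(count)]
    return outputs, time.perf_counter() - start


cgi_outputs, cgi_seconds = timed(cgi_style, REQUESTS)
fast_outputs, fast_seconds = timed(handle_request, REQUESTS)

print("new process per request: %.4f s" % cgi_seconds)
print("long-running process:    %.4f s" % fast_seconds)
```

Both variants produce the same output; only the cost of getting there differs. The absolute numbers depend entirely on the machine and say nothing about a real web server.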

Okay, seriously, let me show you how this works in practice.

Installing Python

For this demo I use Sun's Web Stack. It's probably the easiest way to demonstrate the performance differences between CGI and FastCGI. XAMPP doesn't support FastCGI, because with mod_perl for Perl and mod_php for PHP there is no real need for a FastCGI interface.

First, let me add Python to my basic web stack installation:

[oswald@sol10u7 ~/webstack1.5]% bin/pkg install sun-python26
DOWNLOAD                                    PKGS       FILES     XFER (MB)
Completed                                    1/1   2784/2784   12.65/12.65
PHASE                                        ACTIONS
Install Phase                              2861/2861
PHASE                                          ITEMS
Reading Existing Index                           7/7
Indexing Packages                                1/1
[oswald@sol10u7 ~/webstack1.5]% bin/setup-webstack

If you're familiar with Sun's Web Stack, you'll have noticed that I'm using the IPS installation of Web Stack. That's my favorite way to install it, because it allows me to place the Web Stack in any directory I want and to run the stack without root privileges.

Python with CGI

Setting up CGI is very, very easy and probably that's exactly the reason why so many people still use it.

Let me start with a simple "Hello World!" Python CGI script:

#!/home/oswald/webstack1.5/bin/python
print ""
print "Hello World!"

I named this file hello.py and put it into the cgi-bin directory of my Apache installation. In the case of Web Stack it's var/apache2/2.2/cgi-bin. Add execute permissions:

[oswald@sol10u7 ~]% chmod a+x var/apache2/2.2/cgi-bin/hello.py

Now I log into another box on the same network and use my favorite command-line web browser Lynx to test the newly created Hello World CGI:

[oswald@debian50 ~]% lynx -source http://sol10u7/cgi-bin/hello.py
Hello World!

Looks good. Now let's benchmark this script:

[oswald@debian50 ~]% ab -n 1000 http://sol10u7/cgi-bin/hello.py
...
Time taken for tests:   31.083 seconds
...
Total transferred:      256000 bytes
HTML transferred:       13000 bytes
Requests per second:    32.17 [#/sec] (mean)
...

32 requests/second. That's nothing to be proud of!

Python with FastCGI

And now let's try FastCGI by adding Apache's mod_fcgid to the Web Stack installation:

[oswald@sol10u7 ~/webstack1.5]% bin/pkg install sun-apache22-fcgid
DOWNLOAD                                    PKGS       FILES     XFER (MB)
Completed                                    1/1         6/6     0.09/0.09
PHASE                                        ACTIONS
Install Phase                                  24/24
PHASE                                          ITEMS
Reading Existing Index                           7/7
Indexing Packages                                1/1
[oswald@sol10u7 ~/webstack1.5]% bin/setup-webstack

Activate the default configuration:

[oswald@sol10u7 ~/webstack1.5]% cp etc/apache2/2.2/samples-conf.d/fcgid.conf etc/apache2/2.2/conf.d/

For those who are unable or unwilling to use Sun's Web Stack, the above fcgid.conf file basically contains the following directives:

LoadModule fcgid_module libexec/mod_fcgid.so
SharememPath /home/oswald/webstack1.5/var/run/apache2/2.2/fcgid_shm
SocketPath /home/oswald/webstack1.5/var/run/apache2/2.2/fcgid.sock
AddHandler fcgid-script .fcgi
<Location /fcgid>
    SetHandler fcgid-script
    Options ExecCGI
    allow from all
</Location>

As usual after changing Apache's configuration, we need to reload Apache (aka a graceful restart) to let the new configuration take effect:

[oswald@sol10u7 ~/webstack1.5]% apache2/2.2/bin/apachectl graceful

Now I create a new directory named fcgid directly inside of Apache's document root folder and change into that folder:

[oswald@sol10u7 ~/webstack1.5]% mkdir var/apache2/2.2/htdocs/fcgid
[oswald@sol10u7 ~/webstack1.5]% cd var/apache2/2.2/htdocs/fcgid

To let Python talk to Apache's mod_fcgid, I need to install a so-called Python FastCGI/WSGI gateway. There are several solutions available for Python, but I personally prefer Allan Saddi's fcgi.py:

[oswald@sol10u7 htdocs/fcgid]% wget -q http://svn.saddi.com/py-lib/trunk/fcgi.py

The "Hello World!" Python FastCGI script looks a little different this time:

#!/home/oswald/webstack1.5/bin/python
from fcgi import WSGIServer

def app(environ, start_response):
    start_response('200 OK', [('Content-Type', 'text/html')])
    return('''Hello world!\n''')

WSGIServer(app).run()

This time it's not the output of a script which is sent back to the browser; it's the return value of the function app() that defines the data which goes to the user's browser. In this case it's the simple character string "Hello world!\n".
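To see this calling convention without any web server, here is a small Python 3 sketch that invokes a WSGI application by hand, using only the standard library's wsgiref helpers. It is an illustration, not part of the setup above: the helper call_app() is made up for this example, and because this is Python 3 the body is returned as a list of byte strings rather than a plain string.

```python
from wsgiref.util import setup_testing_defaults


def app(environ, start_response):
    # Same idea as the FastCGI script: set status and headers, return the body.
    start_response('200 OK', [('Content-Type', 'text/html')])
    return [b'Hello world!\n']


def call_app(application):
    """Hypothetical helper: call a WSGI app once and collect its response."""
    environ = {}
    setup_testing_defaults(environ)  # fills in a minimal fake request
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured['status'] = status
        captured['headers'] = headers

    body = b''.join(application(environ, start_response))
    return captured['status'], captured['headers'], body


status, headers, body = call_app(app)
print(status, headers, body)
```

A gateway such as fcgi.py plays the role of call_app() here: it builds the environ from the incoming request, calls the function, and ships the returned data back to the web server.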

Like in the CGI example above, the Python script needs to be executable:

[oswald@sol10u7 htdocs/fcgid]% chmod a+x hello.py

The content of my fcgid directory now looks like this:

[oswald@sol10u7 htdocs/fcgid]% ls -l
total 90
-rw-r--r--   1 oswald   other      44113 Jul 26  2006 fcgi.py
-rwxr-xr-x   1 oswald   other        223 Jan 19 12:48 hello.py

And - like in my CGI example above - I now test the script with Lynx:

[oswald@debian50 ~]% lynx -source http://sol10u7/fcgid/hello.py
Hello world!

And after everything looks fine, I start a little benchmark:

[oswald@debian50 ~]% ab -q -n 1000 http://sol10u7/fcgid/hello.py
...
Time taken for tests:   1.747 seconds
...
Total transferred:      235000 bytes
HTML transferred:       13000 bytes
Requests per second:    572.44 [#/sec] (mean)
...

Yes, gotcha. 572 requests per second: that sounds reasonable. Remember the 32 requests/second from CGI? Do you want the well or will you take the faucet? Sure, implementing a FastCGI program is far more challenging than coding a simple CGI solution, but 572 against 32 requests per second? Do I need to say more?
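For the record, the arithmetic behind the two ab runs above, as a few lines of Python 3. Only the numbers are taken from the benchmark output; the variable names are made up for this sketch.

```python
requests = 1000          # ab -n 1000
cgi_seconds = 31.083     # "Time taken for tests" with CGI
fcgi_seconds = 1.747     # "Time taken for tests" with FastCGI

# Average time spent on one request, in milliseconds.
cgi_ms_per_request = cgi_seconds / requests * 1000
fcgi_ms_per_request = fcgi_seconds / requests * 1000

# How many times faster the FastCGI run finished.
speedup = cgi_seconds / fcgi_seconds

print("CGI:     %.3f ms per request" % cgi_ms_per_request)
print("FastCGI: %.3f ms per request" % fcgi_ms_per_request)
print("Speedup: %.1fx" % speedup)
```

That is roughly 31 ms per request with CGI against less than 2 ms with FastCGI, a factor of about 17.8 for this particular "Hello World!" test.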

Photos: On the right "Faucet" by Joe Shlabotnik, and on the left "Well" by echiner1. Both licensed under Creative Commons.

Tuesday, January 12th, 2010

If you're a Web Stack user, please read Jyri's brief article about Web Stack and the TLS Vulnerability.

Thursday, January 7th, 2010

Happy New Year everyone! Hope you could enjoy your holidays!!

Let's start this year with the third part of my little series of thoughts about caching. After my small memcached intro and thoughts about caching architectures, I now focus on the data you should consider caching in your web application.

cache-what.png

[1] Cache HTML

Obviously the biggest performance win you can achieve is by caching the whole output of your web application: a simple reverse proxy scenario. This works very well for mostly static pages, but for highly dynamic and user-specific content this is not an option: there is no advantage in caching a web page that becomes obsolete a moment later.

Probably the best way to solve this dilemma is to implement a so-called partial-page cache: Let your application cache just portions of the page and leave the rest, where it makes no sense to cache, dynamic.
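A partial-page cache can be sketched in a few lines of Python 3. This is only an illustration under simple assumptions: the cache lives in a dictionary inside one process, and the names FragmentCache, render_top10() and render_page() are made up for this example.

```python
import time


class FragmentCache:
    """Caches rendered HTML fragments for a limited time (in-process only)."""

    def __init__(self):
        self._store = {}  # key -> (expires_at, html)

    def get_or_render(self, key, ttl_seconds, render):
        now = time.monotonic()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]           # still fresh: reuse the fragment
        html = render()               # expired or missing: render again
        self._store[key] = (now + ttl_seconds, html)
        return html


render_calls = {"top10": 0}


def render_top10():
    # Stand-in for an expensive fragment that is the same for every user.
    render_calls["top10"] += 1
    return "<ol><li>...</li></ol>"


def render_page(cache, user):
    # The user-specific part stays dynamic, the shared part comes from the cache.
    greeting = "<p>Hello %s!</p>" % user
    top10 = cache.get_or_render("top10", 60, render_top10)
    return greeting + top10


cache = FragmentCache()
page_a = render_page(cache, "zaphod")
page_b = render_page(cache, "trillian")
print(render_calls["top10"])
```

Two different users get two different pages, but the expensive fragment is rendered only once within its lifetime.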

It's very important that you implement this in a very top layer of your application. Probably exactly that layer which software architects will call the presentation layer. Sure, this is likely to break your framework architecture, but to quote chapter 40 of the Tao Te Ching:

The movement of the Tao
By contraries proceeds;
And weakness marks the course
Of Tao's mighty deeds.

But seriously: If you have to stay within the boundaries of a framework, Ajax is a good way to bypass these restrictions and helps to implement such a cache in a restricted architecture. But be aware that this will raise the number of HTTP requests on your frontend web servers.

An effective caching strategy will always mess your beautifully designed software architecture up. Having just one (central) caching layer looks great in system diagrams and it's better than no cache at all, but it's definitely not the end of the rope.

[2] Cache complex data structures

If you don't want to break your framework architecture or you don't like the idea of caching HTML at all, and I totally understand your point, you should consider caching other (lower level, but still complex) structures of data.

Some examples for suitable data structures:

  • user profiles
  • friends lists
  • current user lists
  • list of locations, branches, countries, languages, ...
  • top 10 (whatever) lists
  • public statistical data
  • ...

The main challenge lies in identifying the most suitable data structures. This is no easy task and strongly depends on the kind of web application you run or plan to run. Avoid caching simple data sets, like row-level data from the database. Don't think row-level.
That's the best advice you should keep in mind. (Note to myself: I need to put this on a t-shirt. I found this phrase in Memcached Internals, a wonderful article inspired by a talk by Brian Aker and Alan Kasindorf.)
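Here is a small Python 3 sketch of what "don't think row-level" can mean in practice. Everything in it is made up for illustration: two dictionaries stand in for database tables, a third dictionary stands in for the cache, and the query counter only exists to show the effect.

```python
# Stand-ins for two database tables.
USERS = {1: {"name": "zaphod"}}
FRIENDS = {1: [2, 3]}

db_queries = 0
profile_cache = {}  # stand-in for a real cache such as memcached


def load_profile_from_db(user_id):
    """Assemble the complete profile; this costs two 'queries'."""
    global db_queries
    db_queries += 2  # one for the user row, one for the friends list
    return {
        "id": user_id,
        "name": USERS[user_id]["name"],
        "friends": list(FRIENDS[user_id]),
    }


def get_profile(user_id):
    """Cache the assembled structure, not the individual rows."""
    key = "profile:%d" % user_id
    if key not in profile_cache:
        profile_cache[key] = load_profile_from_db(user_id)
    return profile_cache[key]


first = get_profile(1)
second = get_profile(1)
print(db_queries)
```

The second call is answered entirely from the cache: the application gets the finished profile back and never touches the individual rows again.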

At first glance Ajax may be an obvious technology to combine with such a cache. But please be aware that moving application logic away from the server-side application to the client side is always a very dangerous task, which easily may compromise the security of your application.

Which allows me to end this post with another quote from Laozi (Tao Te Ching, chapter 63):

All difficult things in the world
are sure to arise from a previous state
in which they were easy.

Thursday, December 17th, 2009

On Tuesday I focused mainly on memcached and PHP, but today I'll take a wider look at caching architectures in general. The main question in defining a cache architecture is where to locate the caching component:

cache-architectures.png

[1] Status quo, the three-tier architecture

In theory, the commonly accepted standard architecture of a software product is divided into three tiers: the presentation tier, the application tier and finally the data tier. In the context of web applications we rediscover these tiers in the trinity of web server, application server and database server.

In the above diagram we find these three tiers with the user (or in technical terms: the browser) on top of this stack.

[2] Cache on top

One very obvious idea is to place the cache in front of the web server, between user and web server. Usually we find this architecture in a so-called reverse proxy configuration. A reverse proxy is quite easy to set up and has a positive impact for web sites with more static content. But for highly dynamic web applications - like most of today's Web 2.0 applications - the caching benefit of a reverse proxy may not be that big.

In general: having a reverse proxy is better than no caching at all. A reverse proxy will always give you a performance benefit.

[3] Cache in between web and application server

Let's move the cache one level down the stack, in between web server and application server. At first sight this may look like a very good idea, because the cache now protects the application server. But at second sight you'll realize that this configuration is mostly the same as architecture 2, just without the benefit of also caching your web server's data.

For exotic scenarios there may be a good reason for this configuration (esp. in combination with load balancing functionality) but in general you should favor architecture 2 over this one.

[4] Cache in between application and database

And another level down in the stack. The cache now sits between application server and database. Again this looks good and seems to be a good idea, at first sight. But at second or third sight you may realize that nearly every database system has its own internal query cache and our cache is only a cache for a cache. And caching a cache is basically never a good idea and can lead to unpredictable, bad consequences.

Another difficulty with this approach is that it's hard to decide when the cache gets dirty (cache jargon for obsolete) and when it's time to clear the cache.

[5] Cache inside of application

And now half a level up again: right into the application tier. This is the most challenging but also the most powerful place to implement caching strategies. Identify time-critical and frequently accessed data during the development process and implement dedicated and customized caching mechanisms. But don't try to build an abstract, unified, common cache for everything.

It's very important to find a specific and suitable solution for each kind of data you want to cache in your application. Otherwise you'll probably just end up with another row-based cache for your database (like architecture 4) or some kind of reverse proxy (like architecture 2).
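One thing a cache inside the application can do that a generic cache in front of the database cannot: the code that changes the data can also throw away the cached copy at that very moment. The following Python 3 sketch shows this for one kind of data only. The class ProfileService and its two dictionaries (one standing in for the database, one for the cache) are assumptions made up for this example.

```python
class ProfileService:
    """Dedicated cache for one kind of data: user profiles."""

    def __init__(self):
        self._database = {}      # stand-in for the real database
        self._cache = {}         # stand-in for the real cache
        self.database_reads = 0  # only here to make the effect visible

    def save(self, user_id, profile):
        # The application knows exactly when the cached copy gets dirty:
        # right here, when the profile is written.
        self._database[user_id] = dict(profile)
        self._cache.pop(user_id, None)

    def load(self, user_id):
        if user_id in self._cache:
            return self._cache[user_id]
        self.database_reads += 1
        profile = dict(self._database[user_id])
        self._cache[user_id] = profile
        return profile


service = ProfileService()
service.save(1, {"name": "zaphod", "heads": 2})
service.load(1)
service.load(1)
reads_before_update = service.database_reads

service.save(1, {"name": "zaphod", "heads": 1})
fresh = service.load(1)
print(reads_before_update, service.database_reads, fresh)
```

Repeated reads hit the database only once, and after an update the next read sees the new data instead of a stale copy.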

Conclusion

Architectures 2, 3 and 4 can easily be set up by system administration without having to involve development in any way. It's mostly a matter of clever configuration, which may also add some load balancing features. In general you'll definitely achieve a better performance of your application, but there is always a limit imposed by the architecture and scaling quality of your core application.

Architecture 5 is probably the best choice, but to get the best results it needs to start at an early stage: you should keep caching in mind during the whole design and development process of your web application. What data is most frequently accessed? What data is expensive (hard to retrieve)? What data depends on user sessions? How up to date does the data need to be?

If you are curious about these questions, please stay tuned for part 3.

Tuesday, December 15th, 2009

Caching is probably the most important technique you should use in today's web sites and web applications. Sure, scaling your hardware is still the final answer to all your load problems, but with some kind of caching your application will scale far better than without.

cachecachecache.png

Currently my favorite caching tool is memcached. It's a slim and ultra-fast distributed caching system. Memcached is basically a key-value store that keeps all data non-persistently in memory: if your server goes down, all the data is gone too, because nothing is written to a hard disk.

Memcached is not meant to be a database, and you'll still need a database to store your data persistently.

Setting up memcached

I'm a very lazy guy and try to avoid boring duties like installing memcached. That's why I love using Sun's Web Stack, which already includes memcached and is so easy to use. If you're not a Web Stack user please take a look at the memcached FAQ to learn how to install memcached on your system.

To add memcached to my IPS-based Web Stack installation I simply call these two commands:

[oswald@localhost ~/demo]$ bin/pkg install sun-memcached
DOWNLOAD                                    PKGS       FILES     XFER (MB)
Completed                                    1/1         9/9     0.17/0.17
PHASE                                        ACTIONS
Install Phase                                  30/30
PHASE                                          ITEMS
Reading Existing Index                           7/7
Indexing Packages                                1/1
[oswald@localhost ~/demo]$ bin/setup-webstack

Now all I need to do is to start the daemon:

[oswald@localhost ~/demo]$ bin/sun-memcached start
Starting memcached

Memcached has no support for any access control at all, so you should use memcached only on private networks or secure your installation with a firewall (port 11211, by the way).

Using memcached with PHP

As I already mentioned memcached is a simple key-value store which is very easy to use for programmers. To show the basic idea I put this small PHP script together:

<?php
	$memcache = new Memcache();
	if(!$memcache->connect('localhost', 11211))
		die("Couldn't connect to memcached! Cruel world!");
	$key="zaphod";
	$result = $memcache->get($key);
	if($result)
	{
		echo "$key is $result";
	}
	else
	{
		$value="cool";
		echo "Set $key to $value";
		$memcache->set($key,$value);
	}
?>

There are three main functions you will need to understand in order to work with memcached:

connect(host,port)
Connects to your memcached server. If you have multiple memcached servers running, you can use addServer() to add one or more servers to the connection pool.
get(key)
Retrieves the value for the given key.
set(key,value)
Stores the given value for the given key. set() also allows you to define an expiration time for the key-value pair.

On the first execution of this script the cache is empty and you'll get this output:

Set zaphod to cool

On the second execution, the value for zaphod is already set and you'll see:

zaphod is cool

That's all. That's the basic way to use memcached.
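The same check-then-set flow, sketched in Python 3. To keep the example runnable without a memcached server, it does not use a real memcached client: the class InMemoryCache is a made-up, in-process stand-in that only mimics get() and set() with an optional expiration time.

```python
import time


class InMemoryCache:
    """Made-up stand-in for a memcached client: get() and set() only."""

    def __init__(self):
        self._data = {}  # key -> (value, deadline or None)

    def set(self, key, value, expire=0):
        # expire=0 means: no expiration, like in the PHP example above.
        deadline = time.monotonic() + expire if expire else None
        self._data[key] = (value, deadline)

    def get(self, key):
        item = self._data.get(key)
        if item is None:
            return None
        value, deadline = item
        if deadline is not None and time.monotonic() >= deadline:
            del self._data[key]  # expired entries behave like missing ones
            return None
        return value


def run_once(cache, key="zaphod"):
    """One 'execution' of the script: read the key, or set it if missing."""
    result = cache.get(key)
    if result is not None:
        return "%s is %s" % (key, result)
    value = "cool"
    cache.set(key, value)
    return "Set %s to %s" % (key, value)


cache = InMemoryCache()
first = run_once(cache)
second = run_once(cache)
print(first)
print(second)
```

The first call finds nothing and stores the value, the second call finds it, just like the two executions of the PHP script.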

What's next...

The next step is to decide what information you want to cache and where you want to cache it. Both are crucial decisions which determine success or failure of your cache. So, stay tuned for part 2. ;)

Thursday, December 10th, 2009

»...just as soon as we are sure what is normal anyway. Thank you.« (HHGTTG)

The last few weeks were a little quiet here in this blog. I had to do some urgent programming for the next release of our Web Stack and last week I had the great pleasure to talk about web application development at the Codebits conference in Lisbon.

4166227540_48a7f716b6_o.jpg
Photography by Lenz Grimmer.

Thursday, November 19th, 2009

PHP and sessions: Very simple to use, but not as simple to understand as we might want to think.

session.gc_maxlifetime

This value (default 1440 seconds) defines how long an unused PHP session will be kept alive. For example: A user logs in, browses through your application or web site, for hours, for days. No problem. As long as the time between his clicks never exceeds 1440 seconds. It's a timeout value.

PHP's session garbage collector runs with a probability defined by session.gc_probability divided by session.gc_divisor. By default this is 1/100, which means that the above timeout value is checked with a probability of 1 in 100.
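What a probability of 1 in 100 means over time can be worked out with a little Python 3. The sketch assumes that every session start is an independent chance for the garbage collector to run; the function name is made up for this example.

```python
def gc_has_run_probability(session_starts, gc_probability=1, gc_divisor=100):
    """Chance that the garbage collector ran at least once."""
    p = gc_probability / gc_divisor
    return 1 - (1 - p) ** session_starts


for starts in (1, 10, 100, 500):
    chance = gc_has_run_probability(starts)
    print("%4d session starts: %.1f%%" % (starts, chance * 100))
```

Under this assumption the collector has run at least once with about 63% probability after 100 session starts. On a site with little traffic an expired session can therefore stay around noticeably longer than session.gc_maxlifetime suggests.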

session.cookie_lifetime

This value (default 0, which means until the browser's next restart) defines how long (in seconds) a session cookie will live. Sounds similar to session.gc_maxlifetime, but it's a completely different approach. This value indirectly defines the "absolute" maximum lifetime of a session, whether the user is active or not. If this value is set to 60, every session ends after a minute.
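The interplay of the two settings can be written down as a small Python 3 model. It is a deliberate simplification and not how PHP is implemented: it ignores the garbage collector's probability and browser restarts, and the function name and its parameters are made up for this example. All times are in seconds.

```python
def session_is_usable(now, created_at, last_access,
                      gc_maxlifetime=1440, cookie_lifetime=0):
    """Simplified model: is the session still usable at time 'now'?"""
    # Idle timeout: counted from the user's last click.
    idle_ok = (now - last_access) <= gc_maxlifetime
    # Absolute limit: counted from the start of the session.
    # 0 means the cookie lives until the browser is closed.
    cookie_ok = cookie_lifetime == 0 or (now - created_at) < cookie_lifetime
    return idle_ok and cookie_ok


# Active user, cookie_lifetime=60: fine after 30 seconds, over after 61.
print(session_is_usable(now=30, created_at=0, last_access=29, cookie_lifetime=60))
print(session_is_usable(now=61, created_at=0, last_access=60, cookie_lifetime=60))

# Default settings: a user who keeps clicking stays logged in for days,
# a user who pauses for more than 1440 seconds does not.
print(session_is_usable(now=100000, created_at=0, last_access=99000))
print(session_is_usable(now=2000, created_at=0, last_access=0))
```

In this model session.gc_maxlifetime is a sliding window that moves with every click, while session.cookie_lifetime is a fixed deadline.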

Wednesday, November 18th, 2009

There are a lot of tutorials out there describing how to use PHP's classic MySQL extension to store and retrieve blobs. There are also many tutorials on how to use PHP's MySQLi extension with prepared statements to fight SQL injections in your web application. But there are no tutorials about using MySQLi with any blob data at all.

Until today... ;)

Preparing the database

Okay, first I need a table to store my blobs. In this example I'll store images in my database because images usually look better in a tutorial than some random raw data.

mysql> CREATE TABLE images (
	id INT(10) UNSIGNED NOT NULL AUTO_INCREMENT,
	image MEDIUMBLOB NOT NULL,
	PRIMARY KEY (id)
);
Query OK, 0 rows affected (0.02 sec)

In general you don't want to store images in a relational database. But that's another discussion for another day.

Storing the blob

To make a long story short, here's the code to store a blob using MySQLi:

<?php
	$mysqli=mysqli_connect('localhost','user','password','db');
	if (!$mysqli)
		die("Can't connect to MySQL: ".mysqli_connect_error());
	$stmt = $mysqli->prepare("INSERT INTO images (image) VALUES(?)");
	$null = NULL;
	$stmt->bind_param("b", $null);
	$stmt->send_long_data(0, file_get_contents("osaka.jpg"));
	$stmt->execute();
?>

If you have already used MySQLi, most of the above should look familiar to you. Two pieces of code are worth a closer look: the $null variable and the call to send_long_data().

  1. The $null variable is needed because bind_param() always wants a variable reference for a given parameter. In this case the "b" (as in blob) parameter. So $null is just a dummy to make the syntax work.
  2. In the next step I need to "fill" my blob parameter with the actual data. This is done by send_long_data(). The first parameter of this method indicates which parameter to associate the data with. Parameters are numbered beginning with 0. The second parameter of send_long_data() contains the actual data to be stored.
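The general idea (a placeholder in the SQL statement, and the binary data handed over separately) is not specific to MySQLi. As a runnable illustration, here is the same round trip in Python 3 with the standard library's sqlite3 module and an in-memory database. This is a stand-in, not MySQL: there is no send_long_data() and no max_allowed_packet here, and the payload is made-up bytes instead of a real image.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE images ("
    " id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " image BLOB NOT NULL)"
)

# Made-up binary payload standing in for the contents of an image file.
payload = bytes(range(256)) * 4

# The blob travels as a bound parameter, never as part of the SQL string.
cursor = conn.execute("INSERT INTO images (image) VALUES (?)", (payload,))
image_id = cursor.lastrowid

row = conn.execute(
    "SELECT image FROM images WHERE id = ?", (image_id,)
).fetchone()

print(image_id, len(row[0]), row[0] == payload)
conn.close()
```

Because the data is bound as a parameter, it needs no escaping, which is exactly the reason to use prepared statements for blobs in the first place.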

While using send_long_data(), please make sure that the blob isn't bigger than MySQL's max_allowed_packet:

mysql> SHOW VARIABLES LIKE 'max_allowed_packet';
+--------------------+----------+
| Variable_name      | Value    |
+--------------------+----------+
| max_allowed_packet | 16776192 |
+--------------------+----------+
1 row in set (0.00 sec)

If your data exceeds max_allowed_packet, you probably won't get any errors returned from send_long_data() or execute(). The saved blob is just corrupt!

Simply raise the value of max_allowed_packet to whatever you need. If you're not able to change MySQL's configuration, you'll need to send the data in smaller chunks:

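	// Open the file in binary mode and send it in chunks
	// that stay well below max_allowed_packet.
	$fp = fopen("osaka.jpg", "rb");
	while (!feof($fp))
	{
		$stmt->send_long_data(0, fread($fp, 8192));
	}
	fclose($fp);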

Usually the default value of 16M should be a good start.
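The chunking itself is independent of MySQL and can be sketched in Python 3. The example below only shows how a stream is cut into pieces of a fixed maximum size; it does not talk to any database. The generator name read_chunks(), the chunk size of 8192 bytes and the 20000 bytes of test data are assumptions made up for this illustration.

```python
import io


def read_chunks(stream, chunk_size):
    """Yield the stream's content in pieces of at most chunk_size bytes."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:  # an empty read means the end of the stream
            break
        yield chunk


# Made-up data standing in for an image file opened in binary mode.
data = b"x" * 20000
chunks = list(read_chunks(io.BytesIO(data), 8192))
sizes = [len(chunk) for chunk in chunks]

print(sizes)
print(b"".join(chunks) == data)
```

Every chunk except the last one has the full size, and putting the chunks back together yields the original data. In the PHP loop, each of these pieces would be one call to send_long_data().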

Retrieving the blob

Getting the blob data out of the database is quite simple and follows the usual way of MySQLi:

<?php
	$mysqli=mysqli_connect('localhost','user','password','db');
	if (!$mysqli)
		die("Can't connect to MySQL: ".mysqli_connect_error());
	$id=1;
	$stmt = $mysqli->prepare("SELECT image FROM images WHERE id=?");
	$stmt->bind_param("i", $id);
	$stmt->execute();
	$stmt->store_result();
	$stmt->bind_result($image);
	$stmt->fetch();
	header("Content-Type: image/jpeg");
	echo $image;
?>

Connect to the database, prepare the SQL statement, bind the parameter(s), execute the statement, bind the result to a variable, and fetch the actual data from the database. In this case there is no need to worry about max_allowed_packet. MySQLi will do all the work:

3925128491.jpg

By the way...

If you want to insert a blob from the command line using MySQL monitor, you can use LOAD_FILE() to fetch the data from a file:

mysql> INSERT INTO images (image) VALUES( LOAD_FILE("/home/oswald/osaka.jpg") );

Be aware that also in this case max_allowed_packet limits the amount of data you're able to send to the database:

mysql> SHOW VARIABLES LIKE 'max_allowed_packet';
+--------------------+-------+
| Variable_name      | Value |
+--------------------+-------+
| max_allowed_packet | 7168  |
+--------------------+-------+
1 row in set (0.00 sec)
mysql> INSERT INTO images (image) VALUES( LOAD_FILE("/home/oswald/osaka.jpg") );
ERROR 1048 (23000): Column 'image' cannot be null
mysql> SET @@max_allowed_packet=16777216;
Query OK, 0 rows affected (0.00 sec)
mysql> SHOW VARIABLES LIKE 'max_allowed_packet';
+--------------------+----------+
| Variable_name      | Value    |
+--------------------+----------+
| max_allowed_packet | 16777216 |
+--------------------+----------+
1 row in set (0.00 sec)
mysql> INSERT INTO images (image) VALUES( LOAD_FILE("/home/oswald/osaka.jpg") );
Query OK, 1 row affected (0.03 sec)

This blog copyright 2010-2012 by Kai 'Oswald' Seidler