Analyzing the Efficiency of the strtr Function

  • Date: 2022-02-09 02:00
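Recently I have often needed to match and replace within strings. I used to reach for str_replace or preg_replace, but strtr is said to be quite efficient, so I ran a comparison: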
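$i = 0;
// $p2 (the replacement array passed to strtr) and $p (the pattern passed to
// preg_replace) are defined elsewhere; the original post does not show them.
$t = microtime(true);
for (; $i < 1000; $i++) {
    $str = strtr(md5($i), $p2);
}
var_dump(microtime(true) - $t);    // 0.085476875305176

$t = microtime(true);
for (; $i < 2000; $i++) {
    $str = preg_replace($p, '', md5($i));
}
var_dump(microtime(true) - $t);    // 0.09863805770874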
[b]The results show that, in this run, strtr was roughly 15% faster than preg_replace.[/b] Over the weekend I took a look at the PHP source code for strtr:
PHP_FUNCTION(strtr)
{
        zval **str, **from, **to;
        int ac = ZEND_NUM_ARGS();

        // Argument check (zend_get_parameters_ex is defined in zend_API.c)
        if (ac < 2 || ac > 3 || zend_get_parameters_ex(ac, &str, &from, &to) == FAILURE) {
                WRONG_PARAM_COUNT;
        }

        // Argument check: the two-argument form requires an array
        if (ac == 2 && Z_TYPE_PP(from) != IS_ARRAY) {
                php_error_docref(NULL TSRMLS_CC, E_WARNING, "The second argument is not an array.");
                RETURN_FALSE;
        }

        convert_to_string_ex(str);

        /* shortcut for empty string */
        // The Z_STRLEN_PP macro is defined in zend_operators.h
        if (Z_STRLEN_PP(str) == 0) {
                RETURN_EMPTY_STRING();
        }

        if (ac == 2) {
                php_strtr_array(return_value, Z_STRVAL_PP(str), Z_STRLEN_PP(str), HASH_OF(*from));
        } else {
                convert_to_string_ex(from);
                convert_to_string_ex(to);

                ZVAL_STRINGL(return_value, Z_STRVAL_PP(str), Z_STRLEN_PP(str), 1);

                php_strtr(Z_STRVAL_P(return_value),
                          Z_STRLEN_P(return_value),
                          Z_STRVAL_PP(from),
                          Z_STRVAL_PP(to),
                          MIN(Z_STRLEN_PP(from),
                              Z_STRLEN_PP(to)));
        }
}
[b]First, let's look at the php_strtr function:[/b]
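// trlen is the smaller of the lengths of str_from and str_to
PHPAPI char *php_strtr(char *str, int len, char *str_from, char *str_to, int trlen)
{
        int i;
        unsigned char xlat[256]; // translation table, one entry per byte value

        if ((trlen < 1) || (len < 1)) {
                return str;
        }

        // Start with the identity mapping: each entry's value equals its index
        for (i = 0; i < 256; xlat[i] = i, i++);

        // Pair up the characters of from and to one by one.
        // For example, from="ab", to="cd" yields the mapping 'a'=>'c', 'b'=>'d'.
        for (i = 0; i < trlen; i++) {
                xlat[(unsigned char) str_from[i]] = str_to[i];
        }

        // Replace. (I suspect this step could still be improved: if the characters
        // to be replaced make up only a small part of the string, most of these
        // assignments accomplish nothing, and checking before assigning might be
        // more efficient in that case. I will test this when I have time.)
        for (i = 0; i < len; i++) {
                str[i] = xlat[(unsigned char) str[i]];
        }

        return str;
}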
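To try the lookup-table idea outside of PHP, the following is a minimal standalone C sketch of the same technique. The name translate_bytes and its signature are hypothetical, chosen only for illustration; it takes NUL-terminated from/to strings and computes the shorter length itself, where php_strtr receives trlen from its caller.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of PHP: translates buf in place using a
 * 256-entry byte table, mirroring the three loops of php_strtr. */
static char *translate_bytes(char *buf, size_t len,
                             const char *from, const char *to)
{
    unsigned char xlat[256];
    size_t i;
    size_t flen, tlen, trlen;

    assert(buf != NULL && from != NULL && to != NULL);

    flen = strlen(from);
    tlen = strlen(to);
    trlen = flen < tlen ? flen : tlen;   /* extra characters are ignored */

    if (trlen < 1 || len < 1) {
        return buf;
    }

    /* Identity mapping: every byte maps to itself. */
    for (i = 0; i < 256; i++) {
        xlat[i] = (unsigned char) i;
    }

    /* Overwrite the entries for the characters listed in from. */
    for (i = 0; i < trlen; i++) {
        xlat[(unsigned char) from[i]] = (unsigned char) to[i];
    }

    /* One table lookup per byte, regardless of how many pairs there are. */
    for (i = 0; i < len; i++) {
        buf[i] = (char) xlat[(unsigned char) buf[i]];
    }

    return buf;
}
```

Calling translate_bytes on a writable buffer holding "abcdaaabcd" with from "ab" and to "efd" leaves "efcdeeefcd" in the buffer. The cost is one pass to build the table plus one pass over the input, which is why the number of character pairs barely affects the running time.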
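As we can see, an operation such as strtr('abcdaaabcd', 'ab', 'efd') should be very efficient: it takes one pass to build the 256-entry table and one pass over the string. (Note: this call outputs efcdeeefcd. The trailing 'd' in 'efd' is ignored because trlen is the shorter of the two lengths.)
[b]Next, let's look at php_strtr_array:[/b]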
static void php_strtr_array(zval *return_value, char *str, int slen, HashTable *hash)
{
        zval **entry;
        char  *string_key;
        uint   string_key_len;
        zval **trans;
        zval   ctmp;
        ulong num_key;
        int minlen = 128*1024;
        int maxlen = 0, pos, len, found;
        char *key;
        HashPosition hpos;
        smart_str result = {0};
        HashTable tmp_hash;

        // Copy the replacement array from hash into tmp_hash, recording the
        // maximum and minimum key lengths along the way
        zend_hash_init(&tmp_hash, 0, NULL, NULL, 0);
        zend_hash_internal_pointer_reset_ex(hash, &hpos);
        while (zend_hash_get_current_data_ex(hash, (void **)&entry, &hpos) == SUCCESS) {
                switch (zend_hash_get_current_key_ex(hash, &string_key, &string_key_len, &num_key, 0, &hpos)) {
                        case HASH_KEY_IS_STRING:
                                len = string_key_len-1;
                                if (len < 1) {
                                        zend_hash_destroy(&tmp_hash);
                                        RETURN_FALSE;
                                }
                                zend_hash_add(&tmp_hash, string_key, string_key_len, entry, sizeof(zval*), NULL);
                                if (len > maxlen) {
                                        maxlen = len;
                                }
                                if (len < minlen) {
                                        minlen = len;
                                }
                                break;

                        // Integer keys are converted to strings,
                        // e.g. array(10=>'aa') becomes array('10'=>'aa')
                        case HASH_KEY_IS_LONG:
                                Z_TYPE(ctmp) = IS_LONG;
                                Z_LVAL(ctmp) = num_key;

                                convert_to_string(&ctmp);
                                len = Z_STRLEN(ctmp);
                                zend_hash_add(&tmp_hash, Z_STRVAL(ctmp), len+1, entry, sizeof(zval*), NULL);
                                zval_dtor(&ctmp);

                                if (len > maxlen) {
                                        maxlen = len;
                                }
                                if (len < minlen) {
                                        minlen = len;
                                }
                                break;
                }
                zend_hash_move_forward_ex(hash, &hpos);
        }

        key = emalloc(maxlen+1);
        pos = 0;

        // Scan from the first character of the string; pos tracks the current position
        while (pos < slen) {
                // If pos plus maxlen runs past the end of the string, shrink maxlen
                if ((pos + maxlen) > slen) {
                        maxlen = slen - pos;
                }

                found = 0;
                memcpy(key, str+pos, maxlen);

                // Try the longest length first: for 'abcd' with array('a'=>'e','ab'=>'f'),
                // 'ab' is replaced with 'f' rather than 'a' with 'e'.
                for (len = maxlen; len >= minlen; len--) {
                        key[len] = 0;

                        // Because a hash table is used, each lookup is fairly efficient
                        if (zend_hash_find(&tmp_hash, key, len+1, (void**)&trans) == SUCCESS) {
                                char *tval;
                                int tlen;
                                zval tmp;

                                if (Z_TYPE_PP(trans) != IS_STRING) {
                                        tmp = **trans;
                                        zval_copy_ctor(&tmp);
                                        convert_to_string(&tmp);
                                        tval = Z_STRVAL(tmp);
                                        tlen = Z_STRLEN(tmp);
                                } else {
                                        tval = Z_STRVAL_PP(trans);
                                        tlen = Z_STRLEN_PP(trans);
                                }

                                // Append the replacement to the result
                                smart_str_appendl(&result, tval, tlen);
                                // Jump forward past the matched key
                                pos += len;
                                found = 1;

                                if (Z_TYPE_PP(trans) != IS_STRING) {
                                        zval_dtor(&tmp);
                                }
                                break;
                        }
                }

                if (! found) {
                        smart_str_appendc(&result, str[pos++]);
                }
        }

        efree(key);
        zend_hash_destroy(&tmp_hash);
        smart_str_0(&result);
        RETVAL_STRINGL(result.c, result.len, 0);
}
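The longest-match-first scan can also be sketched in plain C. This is only an illustration under stated assumptions: struct pair, find_key and replace_longest_first are hypothetical names, and a linear search stands in for PHP's hash table lookup (zend_hash_find), so the sketch shows the matching order rather than the performance characteristics.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical pair type for this sketch; PHP keeps these in a HashTable. */
struct pair {
    const char *key;
    const char *val;
};

/* Linear scan standing in for zend_hash_find: returns the value whose key is
 * exactly the len bytes starting at s, or NULL if there is none. */
static const char *find_key(const struct pair *pairs, size_t n,
                            const char *s, size_t len)
{
    size_t i;
    for (i = 0; i < n; i++) {
        if (strlen(pairs[i].key) == len && memcmp(pairs[i].key, s, len) == 0) {
            return pairs[i].val;
        }
    }
    return NULL;
}

/* Returns a newly malloc'd string (the caller frees it), or NULL if a key is
 * empty or allocation fails. Size overflow is not checked in this sketch. */
static char *replace_longest_first(const char *str,
                                   const struct pair *pairs, size_t n)
{
    size_t slen = strlen(str);
    size_t minlen = (size_t) -1, maxlen = 0, maxval = 1;
    size_t i, pos = 0, out = 0;
    char *result;

    /* Record the shortest and longest key, as php_strtr_array does. */
    for (i = 0; i < n; i++) {
        size_t klen = strlen(pairs[i].key);
        size_t vlen = strlen(pairs[i].val);
        if (klen < 1) return NULL;
        if (klen > maxlen) maxlen = klen;
        if (klen < minlen) minlen = klen;
        if (vlen > maxval) maxval = vlen;
    }

    /* Each step consumes at least one input byte and emits at most maxval. */
    result = malloc(slen * maxval + 1);
    if (result == NULL) return NULL;

    while (pos < slen) {
        size_t limit = maxlen, len;
        const char *val = NULL;

        if (pos + limit > slen) limit = slen - pos;

        /* Try the longest candidate first, then shorter ones. */
        for (len = limit; len >= minlen; len--) {
            val = find_key(pairs, n, str + pos, len);
            if (val != NULL) break;
        }

        if (val != NULL) {
            size_t vlen = strlen(val);
            memcpy(result + out, val, vlen);
            out += vlen;
            pos += len;                     /* skip past the matched key */
        } else {
            result[out++] = str[pos++];     /* no match: copy one byte */
        }
    }

    result[out] = '\0';
    return result;
}
```

With the pairs {"a" => "e", "ab" => "f"}, the input "abcd" becomes "fcd", because "ab" is tried before "a". Replaced text is never rescanned, so the pairs {"hi" => "hello", "hello" => "hi"} turn "hi hello" into "hello hi" rather than looping.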