random


游客886

With 1 million rows and single-row point queries, how did you reach 690,000 TPS?

[postgres@cnsz92pl00192 data]$ pgbench -M prepared -h 127.0.0.1 -p 10002 -n -r -f ~/data/test.sql -c 64 -j 64 -T 10
transaction type: /home/postgres/data/test.sql
scaling factor: 1
query mode: prepared
number of clients: 64
number of threads: 64
duration: 10 s
number of transactions actually processed: 49050
latency average = 13.107 ms
tps = 4882.839136 (including connections establishing)
tps = 8783.875162 (excluding connections establishing)
statement latencies in milliseconds:
        0.002  \set id random(1,1000000)
        7.225  select id from tbl_ip where id=:id;

[postgres@cnsz92pl00192 data]$ pgbench -M prepared -h 127.0.0.1 -p 10002 -n -r -f ~/data/test.sql -c 64 -j 64 -T 20
transaction type: /home/postgres/data/test.sql
scaling factor: 1
query mode: prepared
number of clients: 64
number of threads: 64
duration: 20 s
number of transactions actually processed: 95789
latency average = 13.400 ms
tps = 4776.021870 (including connections establishing)
tps = 8375.462545 (excluding connections establishing)
statement latencies in milliseconds:
        0.002  \set id random(1,1000000)
        7.583  select id from tbl_ip where id=:id;

[postgres@cnsz92pl00192 data]$ pgbench -M prepared -h 127.0.0.1 -p 10002 -n -r -f ~/data/test.sql -c 128 -j 128 -T 20
transaction type: /home/postgres/data/test.sql
scaling factor: 1
query mode: prepared
number of clients: 128
number of threads: 128
duration: 20 s
number of transactions actually processed: 105182
latency average = 24.477 ms
tps = 5229.411053 (including connections establishing)
tps = 18263.330523 (excluding connections establishing)
statement latencies in milliseconds:
        0.001  \set id random(1,1000000)
        6.890  select id from tbl_ip where id=:id;

This question comes from the Yunqi community [PostgreSQL Advanced Group]: https://yq.aliyun.com/articles/690084 — follow the link to join the community.

游客886

Does anyone know which random sampling algorithm PostgreSQL's ANALYZE uses for large tables?
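For context: as far as I recall, the comments in PostgreSQL's src/backend/commands/analyze.c describe a two-stage sample — block-level sampling (Knuth's Algorithm S) followed by Vitter's Algorithm Z, a reservoir sampling variant, over the rows in those blocks. As an illustration of the core idea only (not PostgreSQL's actual code), here is plain reservoir sampling (Algorithm R), which Algorithm Z speeds up:

```python
import random

def reservoir_sample(stream, k, seed=None):
    """Keep a uniform random sample of k items from a stream of unknown length."""
    rng = random.Random(seed)
    sample = []
    for n, item in enumerate(stream, start=1):
        if n <= k:
            sample.append(item)          # fill the reservoir first
        else:
            j = rng.randrange(n)         # keep item with probability k/n
            if j < k:
                sample[j] = item
    return sample
```

Algorithm Z reaches the same distribution while skipping ahead in the stream instead of drawing a random number per row, which matters when "rows" means a multi-gigabyte table.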

python小能手

How can I compute a multidimensional array with broadcasting?

I compute the elements with a double for loop as follows:

import numpy as np

N, l = 20, 10
a = np.random.rand(N, l)
b = np.random.rand(N, l)
r = np.zeros((N, N, l))

for i in range(N):
    for j in range(N):
        r[i, j] = a[i]*a[j]*(b[i]-b[j]) - a[i]/a[j]

Question: how can I vectorise this and compute it with broadcasting? I would also like to require index i not equal to j, meaning the diagonal elements stay zero. Can I do that with vectorisation as well?
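A sketch of how the double loop above could be vectorised (same N, l, a, b as in the question): inserting length-1 axes makes (N,1,l) and (1,N,l) views broadcast to (N,N,l), and the diagonal can be zeroed afterwards with fancy indexing.

```python
import numpy as np

N, l = 20, 10
a = np.random.rand(N, l)
b = np.random.rand(N, l)

# (N,1,l) op (1,N,l) broadcasts to (N,N,l): element [i,j] pairs row i with row j
ai, aj = a[:, None, :], a[None, :, :]
bi, bj = b[:, None, :], b[None, :, :]
r = ai * aj * (bi - bj) - ai / aj

# keep the diagonal (i == j) at zero, as requested
r[np.arange(N), np.arange(N)] = 0
```

This computes all N*N combinations in one pass; zeroing the diagonal afterwards is cheaper than masking it during the computation.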

python小能手

Programmatically making and saving plots in (I)Python without rendering them on screen first

Here is a dummy script that generates three plots and saves them as PDFs:

import matplotlib.pyplot as plt
import pandas as pd
import numpy as np

df = pd.DataFrame({"A": np.random.normal(100),
                   "B": np.random.chisquare(5, size=100),
                   "C": np.random.gamma(5, size=100)})

for i in df.columns:
    plt.hist(df[i])
    plt.savefig(i + ".pdf", format="pdf")
    plt.close()

I'm using Spyder, which uses IPython. When I run this script, three windows pop up and then disappear. It works, but it's a little annoying. How can I save the figures as PDFs without rendering them on screen? I'm looking for something like R's

pdf("path/to/plot/name.pdf")
commands
dev.off()

where nothing is rendered on screen, but the PDF is saved.

python小能手

Grouping 2D data into overlapping circles in x, y

I'm currently working with a fairly large 3D point dataset (x, y, z) and want an efficient way to identify which points fall inside a set of circles in the xy-plane, each with radius r and centre (x1, y1), where x1 and y1 are grid coordinates (each of length 120). The circles overlap, and some points will belong to more than one circle. The output would therefore be the identity of 14400 circles (120*120) and, for each circle, which points of the (x, y, z) list lie inside it.

import numpy as np

def inside_circle(x, y, x0, y0, r):
    return (x - x0)*(x - x0) + (y - y0)*(y - y0) < r*r

x = np.random.random_sample((10000,))
y = np.random.random_sample((10000,))

x0 = np.linspace(min(x), max(x), 120)
y0 = np.linspace(min(y), max(y), 120)

idx = np.zeros((14400, 10000))
r = 2
count = 0

for i in range(0, 120):
    for j in range(0, 120):
        idx[count, :] = inside_circle(x, y, x0[i], y0[j], r)
        count = count + 1

Here inside_circle is a function that returns a boolean array (True or False) for each test point x, y relative to the circle of radius r centred at x0[i] and y0[j]. My main question is: is there a more efficient approach than the nested for loops?
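One way to avoid the 120x120 Python loop is a spatial index. This is a sketch using scipy's cKDTree (an assumption: scipy is available — the question itself only uses numpy); query_ball_point returns, for every circle centre, the indices of the points within radius r, which matches the requested per-circle membership output:

```python
import numpy as np
from scipy.spatial import cKDTree

x = np.random.random_sample((10000,))
y = np.random.random_sample((10000,))

x0 = np.linspace(x.min(), x.max(), 120)
y0 = np.linspace(y.min(), y.max(), 120)
r = 0.05   # note: r = 2 would cover every point in the unit square

tree = cKDTree(np.column_stack([x, y]))              # index the points once
cx, cy = np.meshgrid(x0, y0, indexing="ij")
centers = np.column_stack([cx.ravel(), cy.ravel()])  # the 14400 circle centres
members = tree.query_ball_point(centers, r)          # per-centre list of point indices
```

Overlapping circles are handled naturally, since the same point index may appear in several of the 14400 lists; each query costs roughly O(log n + hits) instead of a full scan.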

python小能手

Generating an (n, 1, 2) array with np.tile

I want to create n (1, 2) arrays, each holding one element repeated twice. First I generate a 1-D array of n values, then I loop over those elements and repeat each one to fill the (n, 1, 2) array. My code is as follows:

import numpy as np

def u_vec():
    return np.array([np.random.rand(1)])

n = 10
u1 = np.zeros(n)
for i in range(n):
    u1[i] = u_vec()
print(u1)

def u_vec1():
    u_vec = np.zeros((n, 2, 1))
    for i in range(len(u1)):
        u_vec[i] += np.tile(u1[i], (2, 1))
    return u_vec

u = u_vec1()
print(u)

The output I get is

[0.4594466  0.80924903 0.3186138  0.03601917 0.9116031  0.68199505
 0.78999837 0.33778259 0.97626521 0.84925156]

[[[0.4594466 0.4594466]]
 [[0. 0.]]
 [[0. 0.]]
 [[0. 0.]]
 [[0. 0.]]
 [[0. 0.]]
 [[0. 0.]]
 [[0. 0.]]
 [[0. 0.]]
 [[0. 0.]]]

I don't understand why only the first element is filled while the others remain zero. The output I want is

[[[0.4594466  0.4594466 ]]
 [[0.3186138  0.3186138 ]]
 [[0.03601917 0.03601917]]
 [[0.9116031  0.9116031 ]]
 [[0.68199505 0.68199505]]
 [[0.78999837 0.78999837]]
 [[0.33778259 0.33778259]]
 [[0.97626521 0.97626521]]
 [[0.84925156 0.84925156]]]
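For what it's worth, both loops can be skipped entirely: adding two length-1 axes to the 1-D vector and tiling along the last axis gives the desired (n, 1, 2) shape in one call (a sketch with the same n = 10 as in the question):

```python
import numpy as np

n = 10
u1 = np.random.rand(n)                     # the n base values
u = np.tile(u1[:, None, None], (1, 1, 2))  # (n,) -> (n,1,1) -> tiled to (n,1,2)
```

Each u[i] is then a (1, 2) array holding u1[i] twice, which is the desired output shape.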

python小能手

Pandas DataFrame: selecting rows based on the values of multiple columns whose names are specified in a list

I have the following DataFrame:

import pandas as pd
import numpy as np

ds = pd.DataFrame({'z': np.random.binomial(n=1, p=0.5, size=10),
                   'x': np.random.binomial(n=1, p=0.5, size=10),
                   'u': np.random.binomial(n=1, p=0.5, size=10),
                   'y': np.random.binomial(n=1, p=0.5, size=10)})

ds
   z  x  u  y
0  0  1  0  0
1  0  1  1  1
2  1  1  1  1
3  0  0  1  1
4  0  0  1  1
5  0  0  0  0
6  1  0  1  1
7  0  1  1  1
8  1  1  0  0
9  0  1  1  1

How do I select the rows where the variables named in a list have the values (0, 1)? Here is what I have so far:

zs = ['z', 'x']
tf = ds[ds[zs].values == (0, 1)]

tf now prints:

   z  x  u  y
0  0  1  0  0
0  0  1  0  0
1  0  1  1  1
1  0  1  1  1
2  1  1  1  1
3  0  0  1  1
4  0  0  1  1
5  0  0  0  0
7  0  1  1  1
7  0  1  1  1
8  1  1  0  0
9  0  1  1  1
9  0  1  1  1

which contains duplicates and incorrect rows (e.g. row #2 — 1, 1, 1, 1).
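A sketch of one way to get the intended selection: compare the listed columns against the target values and keep only rows where every comparison holds, via .all(axis=1). This avoids the duplicated and incorrect rows that the raw element-wise == produces above.

```python
import numpy as np
import pandas as pd

ds = pd.DataFrame({'z': np.random.binomial(n=1, p=0.5, size=10),
                   'x': np.random.binomial(n=1, p=0.5, size=10),
                   'u': np.random.binomial(n=1, p=0.5, size=10),
                   'y': np.random.binomial(n=1, p=0.5, size=10)})

zs = ['z', 'x']
mask = (ds[zs] == [0, 1]).all(axis=1)   # True only where z == 0 AND x == 1
tf = ds[mask]
```

The comparison aligns [0, 1] with the columns in zs, so the same line works for any list of column names and matching tuple of target values.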

python小能手

Generating a 1 GB file with 4 columns in Python

I want to generate a file in Python with the following specification:

Column 1: Sno
Column 2: randomly assigned numbers 1-10
Columns 3-4: random characters of length 1-32

I want this file to exceed 1 GB in size. I'm currently using this code:

import pandas as pd
import numpy as np
import random
import string
from random import choices
from string import ascii_lowercase

df = pd.DataFrame(np.random.randint(low=0, high=10, size=(50000000, 1)),
                  columns=['integer1'])
df['String1'] = ["".join(choices(ascii_lowercase, k=random.randint(1, 32)))
                 for _ in range(50000000)]
df['String2'] = ["".join(choices(ascii_lowercase, k=random.randint(1, 32)))
                 for _ in range(50000000)]

But this code is very slow and takes a lot of time. Is there a more efficient way? Given the string columns, I also want to find the row with the highest number of vowels. Also, is there a way to not generate 50 million rows (as in the code) and still reach 1 GB, something like "anti-compression"?
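The per-row calls to random.choices are the main bottleneck. A sketch of a mostly vectorised alternative (the chunk size and variable names here are illustrative, not from the original code): draw all characters in one numpy call, join only once per row, and generate the file chunk by chunk instead of materialising 50 million rows in memory.

```python
import string
import numpy as np

rng = np.random.default_rng()
letters = np.array(list(string.ascii_lowercase))

def random_strings(n, max_len=32):
    # draw an (n, max_len) grid of letters at once, then trim each row
    # to a random length between 1 and max_len
    grid = rng.choice(letters, size=(n, max_len))
    lengths = rng.integers(1, max_len + 1, size=n)
    return ["".join(row[:k]) for row, k in zip(grid, lengths)]

chunk = 100_000                            # generate/write the file chunk by chunk
nums = rng.integers(1, 11, size=chunk)     # column 2: random numbers 1-10
s1 = random_strings(chunk)                 # column 3
s2 = random_strings(chunk)                 # column 4
```

Repeating this in a loop (with Sno as a running counter) and appending each chunk to the output CSV keeps memory usage flat until the file passes 1 GB.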

python小能手

Tensorflow text_generation

I'm working through the code at https://www.tensorflow.org/tutorials/sequences/text_generation. When I reach these lines, the following error is raised:

sampled_indices = tf.random.categorical(example_batch_predictions[0], num_samples=1)
sampled_indices = tf.squeeze(sampled_indices, axis=-1).numpy()

Error:

AttributeError                Traceback (most recent call last)
----> 1 sampled_indices = tf.random.categorical(example_batch_predictions[0], num_samples=1)
      2 sampled_indices = tf.squeeze(sampled_indices, axis=-1).numpy()

AttributeError: module 'tensorflow._api.v1.random' has no attribute 'categorical'

System information: TensorFlow version 1.12, Jupyter Notebooks on Ubuntu.

python小能手

Keras - Input arrays should have the same number of samples as target arrays

I have the code below running a generative adversarial network (GAN) on 374 training images of size 32x32. Why do I get the following error?

ValueError: Input arrays should have the same number of samples as target arrays. Found 7500 input samples and 40 target samples.

It occurs on this statement:

discriminator_loss = discriminator.train_on_batch(combined_images, labels)

import keras
from keras import layers
import numpy as np
import cv2
import os
from keras.preprocessing import image

latent_dimension = 32
height = 32
width = 32
channels = 3
iterations = 100000
batch_size = 20
real_images = []

# paths to the training and results directories
train_directory = '/training'
results_directory = '/results'

# GAN generator
generator_input = keras.Input(shape=(latent_dimension,))

# transform the input into a 16x16 128-channel feature map
x = layers.Dense(128*16*16)(generator_input)
x = layers.LeakyReLU()(x)
x = layers.Reshape((16,16,128))(x)

x = layers.Conv2D(256,5,padding='same')(x)
x = layers.LeakyReLU()(x)

# upsample to 32x32
x = layers.Conv2DTranspose(256,4,strides=2,padding='same')(x)
x = layers.LeakyReLU()(x)

x = layers.Conv2D(256,5,padding='same')(x)
x = layers.LeakyReLU()(x)
x = layers.Conv2D(256,5,padding='same')(x)
x = layers.LeakyReLU()(x)

# a 32x32 1-channel feature map is generated (i.e. shape of image)
x = layers.Conv2D(channels,7,activation='tanh',padding='same')(x)

# instantiate the generator model, which maps the input of shape (latent dimension)
# into an image of shape (32,32,1)
generator = keras.models.Model(generator_input,x)
generator.summary()

# GAN discriminator
discriminator_input = layers.Input(shape=(height,width,channels))

x = layers.Conv2D(128,3)(discriminator_input)
x = layers.LeakyReLU()(x)
x = layers.Conv2D(128,4,strides=2)(x)
x = layers.LeakyReLU()(x)
x = layers.Conv2D(128,4,strides=2)(x)
x = layers.LeakyReLU()(x)
x = layers.Conv2D(128,4,strides=2)(x)
x = layers.LeakyReLU()(x)
x = layers.Flatten()(x)

# dropout layer
x = layers.Dropout(0.4)(x)

# classification layer
x = layers.Dense(1,activation='sigmoid')(x)

# instantiate the discriminator model, and turn a (32,32,1) input into
# a binary classification decision (fake or real)
discriminator = keras.models.Model(discriminator_input,x)
discriminator.summary()

discriminator_optimizer = keras.optimizers.RMSprop(
    lr=0.0008,
    clipvalue=1.0,
    decay=1e-8)

discriminator.compile(optimizer=discriminator_optimizer, loss='binary_crossentropy')

# adversarial network
discriminator.trainable = False

gan_input = keras.Input(shape=(latent_dimension,))
gan_output = discriminator(generator(gan_input))
gan = keras.models.Model(gan_input,gan_output)

gan_optimizer = keras.optimizers.RMSprop(
    lr=0.0004,
    clipvalue=1.0,
    decay=1e-8)

gan.compile(optimizer=gan_optimizer, loss='binary_crossentropy')

start = 0
for step in range(iterations):
    # sample random points in the latent space
    random_latent_vectors = np.random.normal(size=(batch_size,latent_dimension))

    # decode the random latent vectors into fake images
    generated_images = generator.predict(random_latent_vectors)

    stop = start + batch_size
    i = start
    for root, dirs, files in os.walk(train_directory):
        for file in files:
            for i in range(stop-start):
                img = cv2.imread(root + '/' + file)
                real_images.append(img)
                i = i+1

    combined_images = np.concatenate([generated_images,real_images])

    # assemble labels and discriminate between real and fake images
    labels = np.concatenate([np.ones((batch_size,1)),np.zeros(batch_size,1)])

    # add random noise to the labels
    labels = labels + 0.05 * np.random.random(labels.shape)

    # train the discriminator
    discriminator_loss = discriminator.train_on_batch(combined_images,labels)

    random_latent_vectors = np.random.normal(size=(batch_size,latent_dimension))

    # assemble labels that classify the images as "real", which is not true
    misleading_targets = np.zeros((batch_size,1))

    # train the generator via the GAN model, where the discriminator weights are frozen
    adversarial_loss = gan.train_on_batch(random_latent_vectors,misleading_targets)

    start = start + batch_size
    if start > len(train_directory)-batch_size:
        start = 0

    # save the model weights
    if step % 100 == 0:
        gan.save_weights('gan.h5')

        print 'discriminator loss: '
        print discriminator_loss
        print 'adversarial loss: '
        print adversarial_loss

        img = image.array_to_img(generated_images[0] * 255.)
        img.save(os.path.join(results_directory,'generated_melanoma_image' + str(step) + '.png'))

        img = image.array_to_img(real_images[0] * 255.)
        img.save(os.path.join(results_directory,'real_melanoma_image' + str(step) + '.png'))
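Two things stand out in the batch assembly above (a hedged diagnosis, since the actual data isn't available): real_images is never reset, so combined_images grows on every iteration while labels stays at 40 samples; and np.zeros(batch_size,1) passes 1 as the dtype argument rather than a second dimension — the shape must be a tuple. A minimal numpy-only sketch of a consistent batch, with random arrays standing in for the actual images:

```python
import numpy as np

batch_size = 20
# stand-ins for one batch of generated and one batch of real 32x32 RGB images
generated_images = np.random.rand(batch_size, 32, 32, 3)
real_batch = np.random.rand(batch_size, 32, 32, 3)   # rebuilt fresh each iteration

combined_images = np.concatenate([generated_images, real_batch])
labels = np.concatenate([np.ones((batch_size, 1)),
                         np.zeros((batch_size, 1))])  # note the tuple shape
```

With the real-image list rebuilt (or sliced) per iteration, combined_images and labels both have 2 * batch_size samples, which is what train_on_batch requires.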

python小能手

My Kivy program shows a stray white square in the bottom-left corner

I'm trying to create a program that outputs a random 10x10 grid of black and white squares. It mostly works, except there's an unwanted white square in the bottom-left corner covering part of the grid. I can't even figure out which widget is causing the problem; I've tried printing from root downwards, to no avail.

import random
import kivy
kivy.require("1.10.1")
from kivy.app import App
from kivy.lang import Builder
from kivy.uix.floatlayout import FloatLayout
from kivy.uix.gridlayout import GridLayout
from kivy.uix.label import Label
from kivy.config import Config
from kivy.graphics import Color
from kivy.graphics import Rectangle

Config.set('graphics', 'width', '400')
Config.set('graphics', 'height', '400')

class Container(FloatLayout):
    pass

class ColorLabel(Label):
    def __init__(self, **kwargs):
        super(ColorLabel, self).__init__(**kwargs)
        with self.canvas:
            Color(1, 1, 1, 1)
            self.rect = Rectangle(size=self.size, pos=self.pos)
        self.bind(size=self._update_rect, pos=self._update_rect)

    def _update_rect(self, instance, value):
        self.rect.pos = instance.pos
        self.rect.size = instance.size

    def changeBG(self):
        with self.canvas.after:
            Color(0,0,0,1)
            self.rect = Rectangle(size=self.size, pos=self.pos)

class Main(App):
    def build(self):
        Builder.load_file("EveryImage.kv")
        the_grid = GridLayout(cols=10, spacing=1)
        i = 100
        while i > 0:
            i -= 1
            newLabel = ColorLabel()
            the_grid.add_widget(newLabel)
            x = random.randint(0,1)
            if x == 0:
                newLabel.changeBG()
        root = Container()
        root.add_widget(the_grid)
        return root

# Keep everything below this last!
if __name__ == '__main__':
    Main().run()

Here is the .kv file (EveryImage.kv):

<Container>:  # Container holds all the other layouts
    id: contain
    canvas.before:
        Color:
            rgba: 0,0,0.5,1  # blue, just for the grid
        Rectangle:
            pos: self.pos
            size: self.size

<ColorLabel>:
    canvas.before:
        Color:
            rgba: 1,1,1,1  # white
        Rectangle:
            pos: self.pos
            size: self.size

社区小助手

How can Spark extract only the JSON data from each line?

I have a bunch of files where every line looks like this:

some random non json stuff here {"timestmap":21212121, "name":"John"}

I can't read these files as JSON because of the random content that precedes the JSON data. What is the best way to clean out the random content so the JSON data can be loaded into a DataFrame with proper columns? The end goal is for the final DataFrame to contain only data whose timestamp falls between specific dates.
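Since every line keeps its JSON at the end, one option is to strip the prefix per line before parsing — e.g. read the files with Spark's text reader, extract the {...} substring, then hand the cleaned strings to the JSON reader. A plain-Python sketch of the extraction step (the sample line is the one from the question, including its "timestmap" spelling):

```python
import json
import re

line = 'some random non json stuff here {"timestmap":21212121, "name":"John"}'

# take everything from the first '{' to the last '}' and parse it
m = re.search(r'\{.*\}', line)
record = json.loads(m.group(0))
```

In Spark the same pattern could be applied with regexp_extract on a spark.read.text DataFrame before from_json / spark.read.json; filtering on the resulting timestamp column then gives the date-range result.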

morlory

Built a Python program using Alibaba Cloud's SMS service, but hit a problem when packaging it as an exe

———————————— Update 12.28 ————————————
I have now reinstalled pycrypto, but the missing file is still missing (it may never have shipped with that file, though I did find it inside the installation package). I then copied the file from the installation package into the expected location, but packaging still apparently cannot find it and reports the same error. Still looking for a solution.

———————————— Original question ————————————
The error message is as follows:

Traceback (most recent call last):
  File "main.py", line 15, in <module>
  File "c:\python27\lib\site-packages\PyInstaller-3.4.dev0+ab8fd9753-py2.7.egg\PyInstaller\loader\pyimod03_importers.py", line 396, in load_module
    exec(bytecode, module.__dict__)
  File "login.py", line 10, in <module>
  File "c:\python27\lib\site-packages\PyInstaller-3.4.dev0+ab8fd9753-py2.7.egg\PyInstaller\loader\pyimod03_importers.py", line 396, in load_module
    exec(bytecode, module.__dict__)
  File "lostcode.py", line 14, in <module>
  File "c:\python27\lib\site-packages\PyInstaller-3.4.dev0+ab8fd9753-py2.7.egg\PyInstaller\loader\pyimod03_importers.py", line 396, in load_module
    exec(bytecode, module.__dict__)
  File "send_sms.py", line 5, in <module>
  File "c:\python27\lib\site-packages\PyInstaller-3.4.dev0+ab8fd9753-py2.7.egg\PyInstaller\loader\pyimod03_importers.py", line 396, in load_module
    exec(bytecode, module.__dict__)
  File "lib\site-packages\aliyun_python_sdk_core-2.4.4-py2.7.egg\aliyunsdkcore\client.py", line 40, in <module>
  File "c:\python27\lib\site-packages\PyInstaller-3.4.dev0+ab8fd9753-py2.7.egg\PyInstaller\loader\pyimod03_importers.py", line 396, in load_module
    exec(bytecode, module.__dict__)
  File "lib\site-packages\aliyun_python_sdk_core-2.4.4-py2.7.egg\aliyunsdkcore\auth\Signer.py", line 32, in <module>
  File "c:\python27\lib\site-packages\PyInstaller-3.4.dev0+ab8fd9753-py2.7.egg\PyInstaller\loader\pyimod03_importers.py", line 396, in load_module
    exec(bytecode, module.__dict__)
  File "lib\site-packages\aliyun_python_sdk_core-2.4.4-py2.7.egg\aliyunsdkcore\auth\algorithm\sha_hmac256.py", line 24, in <module>
  File "c:\python27\lib\site-packages\PyInstaller-3.4.dev0+ab8fd9753-py2.7.egg\PyInstaller\loader\pyimod03_importers.py", line 396, in load_module
    exec(bytecode, module.__dict__)
  File "build\bdist.win32\egg\Crypto\PublicKey\RSA.py", line 78, in <module>
  File "c:\python27\lib\site-packages\PyInstaller-3.4.dev0+ab8fd9753-py2.7.egg\PyInstaller\loader\pyimod03_importers.py", line 396, in load_module
    exec(bytecode, module.__dict__)
  File "build\bdist.win32\egg\Crypto\Random\__init__.py", line 28, in <module>
  File "c:\python27\lib\site-packages\PyInstaller-3.4.dev0+ab8fd9753-py2.7.egg\PyInstaller\loader\pyimod03_importers.py", line 396, in load_module
    exec(bytecode, module.__dict__)
  File "build\bdist.win32\egg\Crypto\Random\OSRNG\__init__.py", line 34, in <module>
  File "c:\python27\lib\site-packages\PyInstaller-3.4.dev0+ab8fd9753-py2.7.egg\PyInstaller\loader\pyimod03_importers.py", line 396, in load_module
    exec(bytecode, module.__dict__)
  File "build\bdist.win32\egg\Crypto\Random\OSRNG\nt.py", line 28, in <module>
  File "c:\python27\lib\site-packages\PyInstaller-3.4.dev0+ab8fd9753-py2.7.egg\PyInstaller\loader\pyimod03_importers.py", line 396, in load_module
    exec(bytecode, module.__dict__)
  File "build\bdist.win32\egg\Crypto\Random\OSRNG\winrandom.py", line 7, in <module>
  File "build\bdist.win32\egg\Crypto\Random\OSRNG\winrandom.py", line 6, in __bootstrap__
ImportError: DLL load failed: The specified module could not be found.
[30668] Failed to execute script main

I'm using Python 2.7 and packaged with PyInstaller. The program runs fine before packaging; the packaged exe reports the error above. Any help would be appreciated.

德哥

Come test whether you have the SQL optimization gene!

Following "Shocking! Slow SQL optimized to this speed? I don't buy it!", another round of the SQL challenge is here. This time it's "Showdown at the Forbidden City" — take it on if you dare.

Scenario

A small table A stores some IDs, roughly a few hundred (for example the IDs of patrol vehicles, sanitation vehicles, buses and mini-buses). There is also a log table B. The ID in each log record comes from the small table, but not every ID appears in the log table — on a given day perhaps only a few dozen IDs show up in that day's data (for example vehicle trajectory data, reported every second, so the volume is enormous). How do I quickly find the IDs that did NOT appear today? (Which patrol cars never showed up in their area — slacking off? Which sanitation vehicles didn't go out? Which buses or mini-buses didn't run?)

Test model and data

Create the tables:

create table a(id int primary key, info text);
create table b(id int primary key, aid int, crt_time timestamp);
create index b_aid on b(aid);

Insert the test data:

-- insert 1000 rows into table a
insert into a select generate_series(0,1000), md5(random()::text);

-- insert 10 million rows into table b, covering only 901 of the aid values
insert into b select generate_series(1,10000000), random()*900, clock_timestamp();

Reference SQL and query performance

The two statements below both satisfy the query; the plans are from PostgreSQL 10 and represent the performance without any optimization.

postgres=# explain (analyze,timing) select * from a where id not in (select aid from b);
                                                        QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
 Seq Scan on a  (cost=179053.25..179074.76 rows=500 width=37) (actual time=4369.478..4369.525 rows=100 loops=1)
   Filter: (NOT (hashed SubPlan 1))
   Rows Removed by Filter: 901
   SubPlan 1
     ->  Seq Scan on b  (cost=0.00..154053.60 rows=9999860 width=4) (actual time=0.322..1829.342 rows=10000000 loops=1)
 Planning time: 0.094 ms
 Execution time: 4423.364 ms
(7 rows)

postgres=# explain (analyze,timing) select a.* from a left join b on (a.id=b.aid) where b.* is null;
                                                       QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------
 Hash Right Join  (cost=31.52..280244.69 rows=49999 width=37) (actual time=4316.767..4316.790 rows=100 loops=1)
   Hash Cond: (b.aid = a.id)
   Filter: (b.* IS NULL)
   Rows Removed by Filter: 10000000
   ->  Seq Scan on b  (cost=0.00..154053.60 rows=9999860 width=44) (actual time=0.013..2544.321 rows=10000000 loops=1)
   ->  Hash  (cost=19.01..19.01 rows=1001 width=37) (actual time=0.342..0.342 rows=1001 loops=1)
         Buckets: 1024  Batches: 1  Memory Usage: 76kB
         ->  Seq Scan on a  (cost=0.00..19.01 rows=1001 width=37) (actual time=0.009..0.137 rows=1001 loops=1)
 Planning time: 0.173 ms
 Execution time: 4316.828 ms
(10 rows)

The challenge awaits. Victory hint: use PostgreSQL black magic. The older you are, the stronger your SQL optimization gene. See "A Walk Through 179 SQL Optimization Scenarios".

小赵q1

phpMyAdmin - Error:Failed to generate random CSRF token!

My system is Windows Server 2012 R2. After unpacking phpMyAdmin 4.6.6 I set the value of $cfg['blowfish_secret'] in the configuration file and deployed it to a website. Visiting it produces the error: phpMyAdmin - Error: Failed to generate random CSRF token! What could be the cause, and how do I fix it? MySQL 5.7.17 is already installed and running on my system, and I can access MySQL normally.